GEORGE DAVID BIRKHOFF 

COLLECTED MATHEMATICAL PAPERS 


VOLUME II 


AMERICAN MATHEMATICAL SOCIETY 

531 West 116th Street, New York City


1950 




George David Birkhoff




EDITORIAL COMMITTEE 
D. V. Widder ( Chairman ) 

C. R. Adams 
R. E. Langer
Marston Morse 
M. H. Stone 




TABLE OF CONTENTS 


VOLUME II 


Dynamics (continued)

Dynamical systems with two degrees of freedom
Sur la démonstration directe du dernier théorème de Henri Poincaré par M. Dantzig
Recent advances in dynamics
Surface transformations and their dynamical applications
Celestial mechanics
An extension of Poincaré's last geometric theorem
Stabilità e periodicità nella dinamica
Dynamique. Sur la signification des équations canoniques de la dynamique
Über gewisse Zentralbewegungen dynamischer Systeme
Stability and the equations of dynamics
On the periodic motions of dynamical systems
A remark on the dynamical role of Poincaré's last geometric theorem
Structure analysis of surface transformations (with P. A. Smith)
Une généralisation à n dimensions du dernier théorème de géométrie de Poincaré
Proof of a recurrence theorem for strongly transitive systems
Proof of the ergodic theorem
A new criterion of stability
Sur quelques courbes fermées remarquables
Sur l'existence de régions d'instabilité en dynamique
Recent contributions to the ergodic theory (with B. O. Koopman)
Sur le problème restreint des trois corps. (Premier mémoire)

























Generalized minimax principle in the calculus of variations (with M. R. Hestenes)
Nouvelles recherches sur les systèmes dynamiques
Note sur la stabilité en dynamiques
Sur le problème restreint des trois corps. (Second mémoire)
Some unsolved problems of theoretical dynamics
What is the ergodic theorem?
Ciertas transformaciones en la dinámica sin elementos periódicos (with J. Lifshitz)


Physical Theories

Books on relativity
A theory of matter and electricity
The hydrogen atom and the Balmer formula
A mathematical critique of some physical theories
Newton's philosophy of gravitation with special reference to modern relativity ideas
Einige Probleme der Dynamik
Probability and physical systems
Some remarks concerning Schrödinger's wave equation
On the periodic motions near a given periodic motion of a dynamical system (with D. C. Lewis)
Quantum mechanics and asymptotic series
The foundation of quantum mechanics
Electricity as a fluid
Sir Joseph Larmor and modern mathematical physics
The mathematical nature of physical theories
Matter, electricity and gravitation in flat space-time



























Newtonian and other forms of gravitational theory. I. Newtonian theory, II. Relativistic theories
El concepto matemático de tiempo y la gravitación
On Birkhoff's new theory of gravitation (with A. Barajas, C. Graef, and M. Vallarta)
Flat space-time and gravitation









Reprinted from Trans. Amer. Math. Soc., April, 1917, Vol. 18, pp. 199-300.


DYNAMICAL SYSTEMS WITH TWO DEGREES OF FREEDOM* 

BY

GEORGE D. BIRKHOFF 

1. Introduction. Dynamical systems with two degrees of freedom constitute
the simplest type of non-integrable dynamical problems and possess a
very high degree of mathematical interest. Considerable light has been
thrown upon the nature of such systems by the researches of Hill, Poincaré,
Hadamard, Levi-Civita, and others. The principal advances which I have 
been able to make are here assembled into a general treatment of dynamical 
problems of this kind. 

Part I deals with the formal properties of the equations of motion. These 
equations are taken in the variational form due to Lagrange, with the principal 
function quadratic in the velocities. Six arbitrary functions of two variables 
are involved in this function. By means of a suitable change of variables a 
normal form of the equations is derived which contains only two arbitrary 
functions. This form is well known in the reversible case, i. e., the case when 
linear terms in the velocities are lacking in the principal function, but appears 
to be new in the general irreversible case despite its extreme simplicity. With 
its aid I obtain a new integrable type of the equations of motion, and derive 
an elegant form of the equations of displacement. 

It is known that in the reversible case the equations of motion can always 
be interpreted as those of a particle constrained to move on a fixed smooth 
surface. In order to obtain a clear insight into the nature of the irreversible 
case I have regarded it as important to obtain a simple dynamical interpretation.
It is proved that the motions may be looked upon as the orbits of a
particle constrained to move on a smooth surface which rotates about a fixed 
axis with uniform angular velocity and which carries with it a conservative 
field of force. 

It is thus legitimate in all cases to interpret the motion as the orbit of a 
particle, and this is done throughout the paper. 

In Part II attention is turned to methods by which the existence of periodic
orbits may be directly inferred.

* Presented to the Society, December 27, 1915 and September 4, 1916.

The existence of such orbits in the reversible case, when the orbits are
interpretable as geodesics on a surface, is intuitively manifest. Under proper
conditions a closed geodesic will exist along which the arc length is a minimum, 
and this geodesic will correspond to a periodic orbit. In connection with 
the rigorous development of this minimum method may be cited important
papers by Hadamard* and Hilbert.†

A further suggestive criterion for periodic orbits in the reversible case has
been given by Whittaker.‡ More recently this work has been placed upon a
rigorous basis and extended to multiply connected regions by Signorini.§ 
When we employ the geodesic interpretation, Whittaker’s criterion may be 
formulated as follows: If the two boundaries of a ring on the surface have 
everywhere positive geodesic curvature toward the inner normal, there will 
exist a closed geodesic which makes a single circuit of the ring. 

It is intuitively manifest that the curve of minimum length around the 
ring furnishes such a geodesic. 

My treatment of the periodic orbits begins with the reversible case in 
which I make an immediate extension of the results of Hadamard, Whittaker, 
and Signorini. 

The irreversible case is of an entirely different nature. The integrand of 
the integral replacing arc length is no longer of one sign. Notwithstanding 
this salient difference, Whittaker has given the direct formal extension of his 
criterion to a particular irreversible problem (the restricted problem of three 
bodies) without the least modification of his earlier discussion. || 

By a somewhat elaborate argument I have been able to show that an extension
of this sort is legitimate provided a further inequality (holding in the
restricted problem of three bodies) obtains. An example is constructed to 
establish the necessity for such an inequality. 

The inherent limitation of the minimum method is that it can only yield 
the completely unstable periodic orbits.¶

* Journal de mathématiques, ser. 5, vol. 4 (1898), pp. 27-73.
† Jahresbericht der Deutschen Mathematiker-Vereinigung, vol. 8 (1900), pp. 184-188.
‡ See his Analytical Dynamics (Cambridge, England, 1904), pp. 376-378.
§ Rendiconti del Circolo Matematico di Palermo, vol. 33 (1912),
pp. 187-193. In this connection reference should also be made to a paper by Tonelli in the
same journal, vol. 32 (1911), pp. 297-337.
|| Monthly Notices of the Royal Astronomical Society, vol. 62
(1901-1902), pp. 346-352.
¶ See Poincaré, Les méthodes nouvelles de la mécanique céleste, vol. 3 (Paris, 1899), pp. 283-293.

The minimax method of Part II, which is applied only to the reversible
case, yields a large and entirely different class of periodic orbits. This new
method may be formulated in a special case as follows: There is a minimum
length of closed string, constrained to lie in a given closed surface of genus 0,
which may be slipped over that surface; in some intermediate position the
string will be taut and will then coincide with a closed geodesic.*

The third method, applicable to all types of periodic orbits, is the method 
of analytic continuation of Hill and Poincaré. Application of this simple
method has been so far limited by the restriction that the variation of the
parameter involved be “sufficiently small.” This restriction is necessary
on account of the possibility that the period of an orbit under consideration 
may become infinite. I have succeeded in showing that this possibility does 
not arise in certain classes of reversible and irreversible problems. 

A vital application of the periodic orbits lies in the construction of surfaces 
of section, considered in Part III. The dynamical problem is thereby reduced 
to a transformation T of the surface of section into itself. A reduction of 
this kind was effected by Poincaré† in the restricted problem of three bodies,
where he found a ring-shaped surface of section. The results of Part III 
establish the existence of such surfaces in a wide range of cases, and of varying 
genus and number of boundaries. 

On account of the fact that the transformation of the surface of section 
possesses an invariant area integral, this transformation involves only one 
arbitrary function of two variables. If it be recalled that the normal form 
of the equations of motion involved two such functions, the analytic importance 
of the reduction becomes clear. A fundamental and unanswered question is 
whether the transformations derived from dynamical problems are the most
general ones which have an invariant area integral. 

The essential properties of an orbit are mirrored in corresponding properties 
of the transformation T. Thus invariant points of the transformation and 
its iterations correspond to periodic orbits. 

Part IV contains two theorems on the invariant points of such transformations.
The first of these yields the result that the difference between the
number of unstable orbits and stable orbits is a constant depending only on
the general nature of the transformation. For the case of genus 0 this theorem
stands in close relation to a well known theorem of Brouwer.‡ The second
theorem is based on a modification of Poincaré's last geometric theorem.§
Poincaré showed that for the case of a ring-shaped surface of section the
truth of his theorem implied the existence of infinitely many periodic orbits.
It is not difficult to use his theorem to establish the same fact for the general
case of genus 0. The modified theorem leads to the conclusion that the
same is true when the surface of section is not of genus 0.

* The existence of a closed geodesic on a convex surface was proved by Poincaré, these
Transactions, vol. 6 (1905), pp. 237-274, by entirely different means.
† Les méthodes nouvelles de la mécanique céleste, vol. 3, chap. 33.
‡ Mathematische Annalen, vol. 69 (1910), pp. 176-180.
§ Rendiconti del Circolo Matematico di Palermo, vol. 33 (1912),
pp. 375-407. See also a paper by me, these Transactions, vol. 14 (1913), pp. 14-22.

In a later paper I expect to make a general study of transformations of 
surfaces into themselves, such as are afforded by the transformations T, 
and I reserve for that paper the consideration of non-periodic orbits. 

Part I. The equations of motion 
2. Reduction to a normal form. Let t denote the time, let the variables
x and y denote the two coordinates of the dynamical system under consideration,
and let x', y' denote their respective time derivatives.

The equations of motion will be taken in the Lagrangian form 

(1)    $\frac{d}{dt} L_{x'} - L_x = 0, \qquad \frac{d}{dt} L_{y'} - L_y = 0,$

where L is a given quadratic function of x', y', namely 

(2)    $L = \tfrac{1}{2}\left(a x'^2 + 2b\,x'y' + c\,y'^2\right) + \alpha x' + \beta y' + \gamma,$

and where $L_x$, $L_y$, $L_{x'}$, $L_{y'}$ represent the partial derivatives of L in the
respective variables x, y, x', y'. The two equations are of the second order,
so that their general solution depends on four arbitrary constants.

It will be assumed that a, b, c, α, β, γ are real analytic functions of x and y,
and that the inequalities

(3)    $a > 0, \qquad ac - b^2 > 0$

are satisfied. These restrictions are met in the important cases. We shall
call any particular surface, the square of whose element of arc is

$ds^2 = a\,dx^2 + 2b\,dx\,dy + c\,dy^2,$

the characteristic surface. As t varies the point (x, y) describes an orbit on
this surface or in the xy-plane.

The equations (1) admit the familiar integral

$x' L_{x'} + y' L_{y'} = L + k.$*

We shall restrict attention to the solutions of (1) for which the constant k 
has a given value. Thus the totality of orbits under consideration will depend 
on only three arbitrary constants. Since the equations of motion are not 
altered if L is increased by a constant it will be no limitation to choose the 
constant equal to zero. When this is done the explicit form of the integral 
becomes 

(4)    $\tfrac{1}{2}\left[a x'^2 + 2b\,x'y' + c\,y'^2\right] = \gamma.$

* See Whittaker, Analytical Dynamics, p. 61.




We are therefore restricting attention to those orbits for which the velocity
on the characteristic surface is $\sqrt{2\gamma}$. These lie wholly in the regions γ ≥ 0,
bounded by the ovals of zero velocity γ = 0.
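
The step from the familiar integral to the explicit form (4) can be checked in a few lines. The following is only a sketch added for the reader; the abbreviation $Q$ for the quadratic part of $L$ is introduced here for convenience and does not occur in the original text.

```latex
% Notation assumed for this sketch: Q = a x'^2 + 2 b x' y' + c y'^2.
\begin{align*}
L &= \tfrac12 Q + \alpha x' + \beta y' + \gamma, \\
x' L_{x'} + y' L_{y'} &= Q + \alpha x' + \beta y'
  && \text{(Euler's theorem on homogeneous functions)}, \\
x' L_{x'} + y' L_{y'} - L &= \tfrac12 Q - \gamma = k .
\end{align*}
% Replacing gamma by gamma + k changes L only by a constant, so the
% equations (1) are unaltered; after this change (1/2) Q = gamma, which is (4).
```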

A well known equivalent form for (1) is δJ = 0 where we write

(5)    $J = \int_{t_0}^{t_1} L\,dt,$

and where δ is the customary variation symbol for the case of fixed end-points.*
In fact, the equations (1) are precisely the Euler equations obtained when δJ
is equated to zero.

If one makes use of (4) the integral (5) may be given the form 

(6)    $J^* = \int_{t_0}^{t_1} \left[\sqrt{2\gamma}\,\sqrt{a x'^2 + 2b\,x'y' + c\,y'^2} + \alpha x' + \beta y'\right] dt.$

Consequently the variation δJ* will vanish for variations of x and y subject
to (4), provided only that the initial values of x and y satisfy (1). Moreover
we have identically

(7)    $J - J^* = \tfrac{1}{2}\int_{t_0}^{t_1} \left(\sqrt{a x'^2 + 2b\,x'y' + c\,y'^2} - \sqrt{2\gamma}\right)^2 dt,$

so that along any initial curve for which (4) holds we have δJ − δJ* = 0.
It follows that δJ* must vanish for unrestricted variation of x and y if the
initial curve in the xy-plane is an orbit in the dynamical problems (1), (4).

The integrand of J* is positively homogeneous of dimension unity in x', y'. 
Consequently the value of J* is independent of the parameter t used along the 
path of integration in the xy-plane, and the equation (4) can be regarded
as merely determining the parameter along the path.

If we have δJ* = 0 along a curve, and if the parameter t is so chosen that
(4) holds, we have δJ = δJ* = 0 along the curve.

Accordingly, if δJ* vanishes along a curve, and t is properly chosen, that
curve will be an orbit in the dynamical problem (1), (4).
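
The identity behind (7), and the reason the two variations agree on curves satisfying (4), can be made explicit. This is a brief sketch for the reader, again writing $Q$ (a symbol assumed here, not used in the original) for the quadratic part of $L$.

```latex
% Notation assumed for this sketch: Q = a x'^2 + 2 b x' y' + c y'^2.
\begin{align*}
L - \bigl(\sqrt{2\gamma}\,\sqrt{Q} + \alpha x' + \beta y'\bigr)
  &= \tfrac12 Q - \sqrt{2\gamma}\,\sqrt{Q} + \gamma \\
  &= \tfrac12 \bigl(\sqrt{Q} - \sqrt{2\gamma}\bigr)^2 \;\ge\; 0 .
\end{align*}
% The square vanishes to second order wherever (4), i.e. Q = 2 gamma, holds.
% Its first variation is therefore zero on such a curve, which gives
% delta J = delta J^* there.
```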

The equation δJ* = 0 constitutes the principle of least action for our problem,
and is familiar in the case α = β = 0. By means of this principle the variables
x, y, t may be transformed with facility.

In fact the condition δJ* = 0 is invariant in form under a transformation
of dependent variables from x, y to $\bar{x}$, $\bar{y}$. Thus along the transformed orbit
the same variational condition will be satisfied, save that L is replaced by its
expression in terms of the new variables, while t has the same meaning as
before. Consequently, in order to transform these variables, it is sufficient to
effect the transformation of L directly. The corresponding transformed
equations (1), (4) are then obtained by the use of this new form for L. The
same fact may be deduced from the condition δJ = 0.

* In this connection see Bolza, Vorlesungen über Variationsrechnung (Leipzig, 1909), pp.

We may also determine the modification which (1), (4) undergoes as a result
of a transformation $dt = \mu(x, y)\,d\bar{t}$ of the independent variable. We note
that the integral J* may equally well be written

(8)    $J^* = \int_{\bar t_0}^{\bar t_1} \left[\sqrt{2\gamma\mu}\,\sqrt{\frac{a}{\mu} x'^2 + \frac{2b}{\mu} x'y' + \frac{c}{\mu} y'^2} + \alpha x' + \beta y'\right] d\bar{t},$

in which the accents denote differentiation as to $\bar{t}$.
This modified integral is of the same form as before but evidently corresponds
to a value $\bar{L}$ in which a, b, c, α, β, γ have been modified to a/μ, b/μ, c/μ,
α, β, γμ respectively. The variables x, y are unaltered and of course we have
δJ* = 0 along the same orbits as before. But a comparison of the original
and modified equations (4) shows that the relation $dt = \mu\,d\bar{t}$ obtains between
the new and old parameters along the orbit. By this transformation of t
then, the equations (1), (4) go over into other equations of the same type
with a principal function $\bar{L}$ equal to μ times its given value.

The differential form L dt is invariant under transformations of either type.
We conclude therefore:

By a transformation of variables of the form

(9)    $x = \varphi(\bar{x}, \bar{y}), \qquad y = \psi(\bar{x}, \bar{y}), \qquad dt = \mu(\bar{x}, \bar{y})\,d\bar{t},$

the equations (1), (4) go over into similar equations in which the corresponding
$\bar{L}$ is obtained from the formula $\bar{L}\,d\bar{t} = L\,dt$.
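
The effect of the change of time alone on the coefficients of the principal function can be verified directly. The sketch below is an illustration added for the reader; the notation $Q(u, v)$ for the quadratic form and $\mu$ for the factor in the change of time are assumptions of this sketch.

```latex
% Assumed notation: Q(u,v) = a u^2 + 2 b u v + c v^2, and dt = \mu\, d\bar t.
\begin{align*}
L\,dt
 &= \Bigl[\tfrac12 Q\bigl(\tfrac{dx}{dt},\tfrac{dy}{dt}\bigr)
      + \alpha \tfrac{dx}{dt} + \beta \tfrac{dy}{dt} + \gamma\Bigr] dt \\
 &= \Bigl[\tfrac{1}{2\mu}\, Q\bigl(\tfrac{dx}{d\bar t},\tfrac{dy}{d\bar t}\bigr)
      + \alpha \tfrac{dx}{d\bar t} + \beta \tfrac{dy}{d\bar t}
      + \gamma\mu\Bigr] d\bar t
  \;=\; \bar L\, d\bar t .
\end{align*}
% Reading off coefficients: (a, b, c) -> (a, b, c)/mu, alpha and beta are
% unchanged, and gamma -> gamma * mu, so that L-bar = mu * L.
```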

If ds is the element of arc on the characteristic surface, the part of L dt
quadratic in x', y' may be written $\tfrac12\,ds^2/dt$. Under a transformation (9) this is
evidently carried over into the corresponding part $\tfrac12\,d\bar{s}^2/d\bar{t}$ of $\bar{L}\,d\bar{t}$. By choosing
$\bar{x}$, $\bar{y}$ to be the coordinates of an isothermal net on the characteristic surface
the squared element of arc takes the form $\mu\,(d\bar{x}^2 + d\bar{y}^2)$. Hence if we take
$dt = \mu\,d\bar{t}$, with μ the same as in the element of arc, the new quadratic terms
have the simple form $\tfrac12(\bar{x}'^2 + \bar{y}'^2)$.

We are thus led to the following conclusion: 

For given Lagrangian equations (1) joined with the integral condition (4),
there exists a transformation (9) of the variables x, y, t such that the function L
for the transformed equations may be written

(10)    $\tfrac12\,(x'^2 + y'^2) + \alpha x' + \beta y' + \gamma.$

The new equations and integral condition are then

(1')    $x'' + \lambda y' = \gamma_x, \qquad y'' - \lambda x' = \gamma_y \qquad (\lambda = \alpha_y - \beta_x),$

(4')    $\tfrac12\,(x'^2 + y'^2) = \gamma.$




The advantage of the normal form (1'), (4') is that it involves only two arbitrary
functions of x and y, namely λ and γ, whereas the original form involved
six such functions.
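
For the reader who wishes to see how the normal equations arise from the function (10), the Lagrangian equations can be written out term by term. This is a sketch added for illustration; it assumes only that α, β, γ are functions of x and y, as in the text.

```latex
% Euler--Lagrange equations for L = (1/2)(x'^2 + y'^2) + alpha x' + beta y' + gamma,
% with alpha, beta, gamma functions of x and y only.
\begin{align*}
\frac{d}{dt}L_{x'} - L_x
 &= x'' + \alpha_x x' + \alpha_y y' - (\alpha_x x' + \beta_x y' + \gamma_x)
  = x'' + (\alpha_y - \beta_x)\,y' - \gamma_x = 0, \\
\frac{d}{dt}L_{y'} - L_y
 &= y'' + \beta_x x' + \beta_y y' - (\alpha_y x' + \beta_y y' + \gamma_y)
  = y'' - (\alpha_y - \beta_x)\,x' - \gamma_y = 0 .
\end{align*}
% With lambda = alpha_y - beta_x these are
%   x'' + lambda y' = gamma_x,   y'' - lambda x' = gamma_y.
% Multiplying by x' and y' respectively and adding, the lambda terms cancel:
%   d/dt [ (1/2)(x'^2 + y'^2) - gamma ] = 0,
% which is the energy integral in the normal form.
```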

According to the terminology used in the introduction, we shall term a
dynamical problem in which λ vanishes identically a reversible problem. This
is the case when (1'), (4') are not altered if t be replaced by −t; here the orbits
may be described in either sense. When λ does not vanish identically we will
term the problem irreversible.

In the reversible case the linear terms of L in x', y' may be taken as lacking.
The equations (1'), (4') become those of a particle of unit mass and rectangular
coordinates x, y, which moves in a conservative field of force derived from a
potential function −γ. This normal form is well known in the reversible
case,* but I have not found anywhere the simple extension given above to
the general case.†

3. Transformation of the normal form. The transformations (9) of x , y , t 
used in § 2 form a group, and the reduction to the normal form there given 
is not unique. In this way the question arises: Under what subgroup does 
the normal form remain invariant? The answer is contained in the following 
statement: 

A transformation of variables

(11)    $x \pm iy = f(\bar{x} \pm i\bar{y}), \qquad dt = |f'(\bar{x} \pm i\bar{y})|^2\,d\bar{t},$

where f(z) is analytic in z, and f'(z) denotes the derivative of f(z), leaves
(1'), (4') unaltered in form, with λ, γ replaced by ±λ|f'|², γ|f'|² respectively.‡

We will proceed to establish this statement by an indirect method.

It was observed in § 2 that the most general variables x, y for the normal
form (1'), (4') corresponded to any isothermal net on the characteristic surface.
Hence the possible transformations from x, y to $\bar{x}$, $\bar{y}$ which preserve the
normal form are the conformal and anti-conformal transformations specified
in the italicized statement. The corresponding transformation of t (see § 2)
is then that stated, inasmuch as we have $dx^2 + dy^2 = |f'|^2\,(d\bar{x}^2 + d\bar{y}^2)$.

The general principle of transformation enunciated in § 2 shows at once
that we have

$\bar\alpha = \alpha x_{\bar x} + \beta y_{\bar x}, \qquad \bar\beta = \alpha x_{\bar y} + \beta y_{\bar y}, \qquad \bar\gamma = |f'|^2\,\gamma.$

Thus $\bar\gamma$ has the stated value.

* See Darboux, Leçons sur la théorie des surfaces, vol. 2, second edition
(Paris, 1915).

† I have employed these equations incidentally; see Rendiconti del Circolo
Matematico di Palermo, vol. 39 (1915), pp. 271-273.
‡ A direct proof was given by me. See reference above.



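The function $\bar\lambda$ is of course $\bar\alpha_{\bar y} - \bar\beta_{\bar x}$ by definition. To obtain the explicit
form of $\bar\lambda$ we resort to a device. By Green's theorem we have

$\iint_S (\alpha_y - \beta_x)\,dx\,dy = -\int_B (\alpha\,dx + \beta\,dy),$

$\iint_{\bar S} (\bar\alpha_{\bar y} - \bar\beta_{\bar x})\,d\bar x\,d\bar y = -\int_{\bar B} (\bar\alpha\,d\bar x + \bar\beta\,d\bar y),$

where S and $\bar S$ denote any two corresponding continua in the xy- and the
$\bar x\bar y$-planes respectively, and where B and $\bar B$ denote the complete boundaries
of S and $\bar S$ taken in the positive sense.

But we see at once from the explicit formulas for $\bar\alpha$ and $\bar\beta$ that we have

$\int_B (\alpha\,dx + \beta\,dy) = \pm\int_{\bar B} (\bar\alpha\,d\bar x + \bar\beta\,d\bar y).$

Thus for any continua S and $\bar S$ the equality

$\iint_S (\alpha_y - \beta_x)\,dx\,dy = \pm\iint_{\bar S} (\bar\alpha_{\bar y} - \bar\beta_{\bar x})\,d\bar x\,d\bar y$

holds.

If now we express x, y in the first double integral in terms of $\bar x$, $\bar y$, there is
obtained

$\iint_{\bar S} \lambda\,|f'|^2\,d\bar x\,d\bar y = \pm\iint_{\bar S} \bar\lambda\,d\bar x\,d\bar y$

for an arbitrary continuum $\bar S$. Hence $\bar\lambda$ is equal to ±λ|f'|² where the + or −
sign is to be taken according as the transformation from x, y to $\bar x$, $\bar y$ preserves
or reverses sense.

The italicized statement yields nothing new in the reversible case.*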

4. A new integrable case. In the reversible case it is known that when, 
after a proper preliminary transformation of variables, the function γ in
equation (1') reduces to the sum of a function of x and a function of y, the
equations of motion will be integrable. In fact x' and y' are integrating
factors of the first and second equations (1') under these circumstances. A
famous problem of this sort is that afforded by a particle which moves in a
plane attracted by two fixed particles in that plane according to the Newtonian
law.

The above case is essentially the most general reversible case in which 
there exists a quadratic integral. 

I propose to make application of the results of §§ 2, 3 to discuss a new
integrable case. This case is the most general irreversible case in which there
exists an integral linear in x', y',

$l x' + m y' + n = \text{const.},$

which holds for all solutions of (1), (4).

* See Darboux, loc. cit., or Kasner, Differential-geometric Aspects of Dynamics, Princeton
Colloquium Lectures (New York, 1913), in particular pp. 81-87.

Such an integral maintains its form under the most general transformation
(9). Hence we may assume that the equations of motion are taken in the
normal form (1'), (4').

If the linear integral be differentiated as to t, the equation which results
must be an identity in virtue of (1'), (4'). The equations (1') may be employed
to eliminate x'', y''. When this has been done an equation quadratic
in x', y' is obtained which must be an identity in virtue of (4') alone. These
quadratic terms are

$l_x x'^2 + (l_y + m_x)\,x'y' + m_y y'^2.$

In order that these terms shall combine with those of lower degree in x', y'
by the use of (4'), they must be of the form $\rho\,(x'^2 + y'^2)$. This implies
$l_x = m_y$, $l_y = -m_x$, i.e., that $l = u_y$, $m = u_x$ where u is a harmonic function.
The integral can now be written

$u_y x' + u_x y' + n = \text{const.}$
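
The passage from the quadratic terms to the harmonic function u may be spelled out as follows. This is a sketch added for the reader and uses only the quantities l, m, n, u of the text, all taken to be functions of x and y.

```latex
% Differentiate l x' + m y' + n along a motion; l, m, n depend on x, y only.
\begin{align*}
\frac{d}{dt}\,(l x' + m y' + n)
 &= l x'' + m y'' + l_x x'^2 + (l_y + m_x)\,x'y' + m_y y'^2 + n_x x' + n_y y' .
\end{align*}
% The normal equations give x'' and y'' as expressions of degree at most one
% in x', y', so the only terms of degree two are
%   l_x x'^2 + (l_y + m_x) x'y' + m_y y'^2 ,
% and these are a multiple of x'^2 + y'^2 exactly when
\begin{align*}
l_x = m_y, \qquad l_y + m_x = 0 .
\end{align*}
% The first condition says that m dx + l dy is locally exact, so that
% l = u_y, m = u_x for some function u; the second then reads
% u_{yy} + u_{xx} = 0, i.e. u is harmonic.
```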

According to the principles of § 3, a further arbitrary conformal transformation
of the xy-plane, joined with the appropriate change of the variable t,
will leave (1'), (4') in the normal form. In order to simplify further
the linear integral we shall choose such a transformation of x and y in a particular
way, namely

(12)    $\bar x + i\bar y = \int \frac{dx + i\,dy}{u_y + i u_x}.$

Since the function $u_y + i u_x$ is an analytic function of x + iy, the integral on
the right will also be analytic in x + iy. Hence the inverse transformation
$x + iy = f(\bar x + i\bar y)$ is also conformal, and we have

$\left|\frac{dx + i\,dy}{d\bar x + i\,d\bar y}\right|^2 = |f'(\bar x + i\bar y)|^2 = u_x^2 + u_y^2,$

so that the transformed value of t is defined by

$dt = (u_x^2 + u_y^2)\,d\bar t.$

From this last equation we find at once

$\bar x' + i\bar y' = (u_y - i u_x)\,(x' + iy'),$

where $\bar x' = d\bar x/d\bar t$, $\bar y' = d\bar y/d\bar t$. Thus we have in particular

$\bar x' = u_y x' + u_x y'.$




Consequently when such a further transformation of the xy-plane has been
effected, the above integral is simplified to

$x' + n = \text{const.}$

Now let this integral be differentiated as to t and let x'' be eliminated by
means of the first equation (1'). There results

$n_x x' + (n_y - \lambda)\,y' + \gamma_x = 0,$

which must vanish identically in virtue of (4'). Therefore we conclude that
the left-hand member vanishes identically in x', y'. But this will happen
if and only if λ and γ are functions of y alone, in which event a choice of n
can be made so that the equation does obtain identically. We are led in this
way to state the following result:

If a dynamical system (1), (4) admits of an integral linear in x', y', it is
possible by a suitable transformation

$x = \varphi(\bar x, \bar y), \qquad y = \psi(\bar x, \bar y), \qquad dt = \mu(\bar x, \bar y)\,d\bar t$

to throw these equations into the normal form (1'), (4') in such wise that λ and γ
become functions of y alone, and the linear integral takes the form

(13)    $x' + \int \lambda\,dy = c_1.$


These equations of motion may be explicitly integrated:

(14)    $x = \int \frac{\left(c_1 - \int \lambda\,dy\right) dy}{\sqrt{2\gamma - \left(c_1 - \int \lambda\,dy\right)^2}} + c_2, \qquad t = \int \frac{dy}{\sqrt{2\gamma - \left(c_1 - \int \lambda\,dy\right)^2}} + c_3.$

In fact, we have for n the value $\int \lambda\,dy$ so that x' may be expressed as $c_1 - \int \lambda\,dy$
by means of the linear integral. Also y' may be similarly expressed by the aid
of (4'). This last relation, by a further integration, yields the value for t in
terms of y as given in the second equation (14). By substitution of the value
of dt in the linear integral and an integration, we get the first equation.
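
The quadratures just described can be laid out step by step. The following sketch is added for the reader; the shorthand $\Lambda(y)$ for the indefinite integral of λ is an assumption of this sketch and is not notation used in the original.

```latex
% Assumed shorthand for this sketch: \Lambda(y) = \int \lambda\,dy.
\begin{align*}
x' &= c_1 - \Lambda(y)
   && \text{from the linear integral}, \\
y'^2 &= 2\gamma - x'^2 = 2\gamma(y) - \bigl(c_1 - \Lambda(y)\bigr)^2
   && \text{from the energy integral}, \\
dt &= \frac{dy}{\sqrt{2\gamma - (c_1 - \Lambda)^2}},
 \qquad
dx = x'\,dt = \frac{(c_1 - \Lambda)\,dy}{\sqrt{2\gamma - (c_1 - \Lambda)^2}} .
\end{align*}
% Integrating the last two relations gives t and x as functions of y.
% The positive root applies on arcs where y increases with t; on arcs where
% y decreases the sign of the square root is reversed.
```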

It has been assumed that the transformation has been effected which reduces
the equations of motion to a normal form in which λ and γ are functions
of y only. A method of doing this is afforded by the following criterion, at
least if the original equations are in normal form:

If the equations of a dynamical system (1'), (4') are of this integrable type,
the curves λ/γ = const. form one family of an isothermal net in the xy-plane.
When this net is transformed into the net x = const., y = const. by a conformal
transformation of the xy-plane such that the curves λ/γ = const. go over into
the curves y = const., then λ/γ becomes a function of y alone. If, further, λ and γ
are each functions of y alone the resultant equations can be integrated as above,
and not otherwise.

To see the truth of these statements, it is necessary to observe that the
ratio λ/γ is unaltered by a conformal transformation of the xy-plane. Indeed
the result of such a transformation has been seen in § 3 to multiply λ and γ
by the same factor. But in the final form of the equations of motion λ/γ
is a function of y alone. Since the final form was obtained from the given
form by a conformal transformation it follows that the curves λ/γ = const.
form one family of an isothermal net in the original plane, at least if the given
equations (1'), (4') possess a linear integral.

Moreover the curves λ/γ = const. can only form a family of one such isothermal
net. If then we make the conformal transformation which takes this
net into the net x = const., y = const. as stated, it is clear that λ/γ becomes a
function of y alone, and that λ and γ must now become separately functions
of y alone if the equations are to belong to this integrable type.*

We now propose to assume that the given equations are in the general form
(1), (4), and we are led to the following result:

If the equations of a dynamical system (1), (4) are of this integrable type, the
curves

(15)    $\frac{\lambda}{\gamma\sqrt{ac - b^2}} = \text{const.}$

form one family of an isothermal net on the characteristic surface. When this
net is chosen as the net x = const., y = const., so that the family first specified
goes over into y = const., the equations of motion take the normal form (1'), (4')
and λ/γ becomes a function of y alone. If, in addition, λ and γ are each functions
of y alone, the resulting equations can be integrated as above, and not otherwise.

Let us make the corresponding transformation (9) of the variables in the
given equations, which takes them into the directly integrable normal form.
The curves $\bar x$ = const., $\bar y$ = const. will then form an isothermal net on the
characteristic surface. If we let J stand for the jacobian of the transformation
from x, y to $\bar x$, $\bar y$, the principle of transformation of variables given in § 2
shows that we have

$\bar\lambda = J\lambda, \qquad \bar\gamma = \mu\gamma = J\sqrt{ac - b^2}\,\gamma.$

The first relation may be derived precisely as the analogous formula $\bar\lambda = |f'|^2\lambda$
(which is indeed a special case) was derived in § 3. To establish the second
we note that the transformation of x and y alone reduces the quadratic terms
of L to the form

$\tfrac12\,J\sqrt{ac - b^2}\,(\bar x'^2 + \bar y'^2),$

so that we must take $\mu = J\sqrt{ac - b^2}$ in the subsequent transformation
$dt = \mu\,d\bar t$ of t.

* This method will not apply if λ/γ reduces identically to a constant.

Thus the family of curves specified in the statement goes over into the family
$\bar\lambda/\bar\gamma$ = const. and so coincides with the isothermal family $\bar y$ = const.,
since in the directly integrable case $\bar\lambda/\bar\gamma$ is a function of $\bar y$ alone. The first
part of the statement is therefore proved. The second part is obvious.

The above integrable class of equations can be obtained from an entirely
different point of view, namely as the most general class of equations (1),
(4) which admit of a continuous group of transformations (9) into themselves.

5. The equations of displacement. As a second illustration of the methods 
introduced in §§ 2, 3, we proceed to derive the equations of displacement 
for a system (1), (4). As far as I am aware these equations have not hitherto 
been obtained in the abbreviated normal form which I give.* 

By a properly chosen transformation of the variables x, y we may make the
equation of the given orbit on the characteristic surface become n = 0 in
the new variables s and n, and in such a way that s measures arc lengths along
the orbit. The meaning of this transformation when interpreted on the
characteristic surface is that the orbit is taken as a base curve of one family
of an isothermal net while the orthogonal family has for its parameter the
arc length to its point of intersection with the orbit. Hence, on account of
the known properties of such an isothermal net, the variable n will measure
normal displacements away from the orbits (at least if these are small) just
as s measures displacement along the orbit.

We shall let t denote the time corresponding to the variables s, n instead
of to x, y. If, however, the equations are given in the normal form (1'), (4'),
we can pass by a conformal transformation from x, y to s, n. In this case
we will have |f'| = 1 along the orbit (see § 3) so that the two times agree along
the orbit.

In the new variables s, n the equations of motion are in the normal form
(1'), (4') and have the particular solution $s = s_0(t)$, n = 0.

Let us denote the particular values of λ and γ in these equations (1'), (4')
by $\bar\lambda$ and $\bar\gamma$ respectively. If we substitute the particular solution furnished
by the orbit in the equations we get the relations

$s_0'' = \bar\gamma_s(s_0, 0), \qquad -\bar\lambda(s_0, 0)\,s_0' = \bar\gamma_n(s_0, 0), \qquad s_0' = \sqrt{2\bar\gamma(s_0, 0)}.$

* See Poincaré, Les méthodes nouvelles de la mécanique céleste, vol. 3, pp. 280-283, and also 
my paper, Rendiconti del Circolo Matematico di Palermo, vol. 39 
(1915), pp. 273-275. 


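If now we consider a slightly modified solution $s = s_0 + \epsilon\,\delta s$, $n = \epsilon\,\delta n$, 
and allow $\epsilon$ to approach zero, $\delta s$ and $\delta n$ will approach a solution of the 
equations of displacement 

$$ \delta s'' + \bar\lambda\,\delta n' = \bar\gamma_{ss}\,\delta s + \bar\gamma_{sn}\,\delta n, $$

$$ \delta n'' - \bar\lambda\,\delta s' - \sqrt{2\bar\gamma}\,(\bar\lambda_s\,\delta s + \bar\lambda_n\,\delta n) = \bar\gamma_{sn}\,\delta s + \bar\gamma_{nn}\,\delta n, $$

$$ \sqrt{2\bar\gamma}\,\delta s' = \bar\gamma_s\,\delta s + \bar\gamma_n\,\delta n, $$

which are deduced from (1'), (4') by the usual method of variation. Here 
use has been made of the last relation noted to eliminate $s_0'$. 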

The first of these three equations in $\delta s$, $\delta n$ can be derived from the last by 
differentiation as to t and subsequent division by $\sqrt{2\bar\gamma}$. Such an interrelation 
is to be looked for since the equation (4') is not independent of the two 
equations (1') but is derivable from them by an integration. 

Moreover, by use of the last of the three displacement equations, we may 
eliminate $\delta s'$ from the second equation. The quantity $\delta s$ will disappear at 
the same time. In this way we are led to the following conclusion: 

If s and n denote displacement along and normal to a given orbit on the 
characteristic surface, the differential equations of displacement may be written 

(16) $$ \delta n'' + I\,\delta n = 0, \qquad \delta s' - \frac{\bar\gamma_s}{\sqrt{2\bar\gamma}}\,\delta s = -\bar\lambda\,\delta n, $$

where 

(17) $$ I = \bar\lambda^2 - \bar\lambda_n\sqrt{2\bar\gamma} - \bar\gamma_{nn} . $$

Here $\bar\lambda$ and $\bar\gamma$ denote the value of $\lambda$ and $\gamma$ respectively in the equations (1'), (4'), 
with the isothermal variables s, n so chosen that s represents arc length along 
the orbit n = 0. 

The first of these two equations is a linear differential equation of the 
second order in $\delta n$ alone, and is the equation of normal displacement. It plays 
an important role in the sequel. The quantity I may be explicitly computed 
in terms of the original variables x, y, and the corresponding functions a, b, c, 
$\alpha$, $\beta$, $\gamma$. 
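
The elimination that leads to the equation of normal displacement can be spelled out as follows. This is only a sketch reconstructed from the relations stated in this section; the bars are dropped for brevity, and every coefficient is understood to be evaluated along the orbit n = 0, where $s_0' = \sqrt{2\gamma}$ and $\gamma_n = -\lambda\sqrt{2\gamma}$.

```latex
\begin{align*}
\delta s' &= \frac{\gamma_s}{\sqrt{2\gamma}}\,\delta s - \lambda\,\delta n
  && \text{(last displacement equation, using } \gamma_n = -\lambda\sqrt{2\gamma}\text{)},\\
\gamma_{sn} &= -\lambda_s\sqrt{2\gamma} - \frac{\lambda\,\gamma_s}{\sqrt{2\gamma}}
  && \text{(differentiating } \gamma_n = -\lambda\sqrt{2\gamma} \text{ as to } s\text{)},\\
0 &= \delta n'' - \lambda\,\delta s'
     - \sqrt{2\gamma}\,(\lambda_s\,\delta s + \lambda_n\,\delta n)
     - \gamma_{sn}\,\delta s - \gamma_{nn}\,\delta n\\
  &= \delta n'' + \bigl(\lambda^2 - \lambda_n\sqrt{2\gamma} - \gamma_{nn}\bigr)\,\delta n .
\end{align*}
```

On substituting the first two lines into the third, the terms in $\delta s$ cancel identically, which is why $\delta s$ disappears together with $\delta s'$.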

6. Two equivalents of the normal form. Let us introduce the auxiliary 
variable 

(18) $$ \phi = \arctan\frac{y'}{x'}, $$

so that $\phi$ indicates the angle which the direction of motion in the xy-plane 
makes with the x-axis. We have then at once the three equations 

(19) $$ \begin{aligned}
x' &= \sqrt{2\gamma}\,\cos\phi = X(x, y, \phi), \\
y' &= \sqrt{2\gamma}\,\sin\phi = Y(x, y, \phi), \\
\phi' &= \lambda + \frac{-\gamma_x\sin\phi + \gamma_y\cos\phi}{\sqrt{2\gamma}} = \Phi(x, y, \phi).
\end{aligned} $$




The first pair of equations result from the equation (4') and the fact that $\phi$ 
has the stated significance. The last equation may be deduced by forming $\phi'$ 
and eliminating x'', y'', x', y' by means of the equations (1') and the first two 
equations (19). The system of equations (19) is of the third order and 
equivalent to (1'), (4'). 
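
The stated equivalence can be checked numerically. The following is a minimal sketch, not part of the original argument: the functions `gamma` and `lam` are hypothetical choices made only for illustration, and a classical Runge-Kutta scheme is assumed as the integrator. It integrates the second-order normal form and the third-order system in x, y, $\phi$ from matching initial data and measures how far apart the two end-points are.

```python
import math

# Hypothetical smooth test functions, chosen only for illustration (gamma > 0).
def gamma(x, y):
    return 2.0 + 0.5 * math.sin(x) * math.cos(y)

def gamma_x(x, y):
    return 0.5 * math.cos(x) * math.cos(y)

def gamma_y(x, y):
    return -0.5 * math.sin(x) * math.sin(y)

def lam(x, y):
    return 0.5 + 0.3 * x - 0.2 * y

def rhs_normal(state):
    """Normal form (1'): x'' + lam*y' = gamma_x, y'' - lam*x' = gamma_y."""
    x, y, vx, vy = state
    return (vx, vy,
            gamma_x(x, y) - lam(x, y) * vy,
            gamma_y(x, y) + lam(x, y) * vx)

def rhs_angle(state):
    """Third-order system (19) in the variables x, y, phi."""
    x, y, phi = state
    root = math.sqrt(2.0 * gamma(x, y))
    dphi = lam(x, y) + (-gamma_x(x, y) * math.sin(phi)
                        + gamma_y(x, y) * math.cos(phi)) / root
    return (root * math.cos(phi), root * math.sin(phi), dphi)

def rk4(rhs, state, dt, steps):
    """Classical fourth-order Runge-Kutta integration."""
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * dt * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + dt * k for s, k in zip(state, k3)))
        state = tuple(s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
                      for s, a, b, c, d in zip(state, k1, k2, k3, k4))
    return state

def endpoint_gap(x0=0.1, y0=0.2, phi0=0.5, dt=1e-3, steps=2000):
    """Distance between the end-points reached by (1') and by (19)."""
    root = math.sqrt(2.0 * gamma(x0, y0))
    start = (x0, y0, root * math.cos(phi0), root * math.sin(phi0))
    a = rk4(rhs_normal, start, dt, steps)
    b = rk4(rhs_angle, (x0, y0, phi0), dt, steps)
    return math.hypot(a[0] - b[0], a[1] - b[1])
```

The initial velocity is taken on the energy level (4'), which is the only case in which the two systems are expected to agree; `endpoint_gap()` should then be of the order of the integration error.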

If we eliminate the variable t, the equations (19) reduce to the second 
order and become 

$$ \frac{dx}{X} = \frac{dy}{Y} = \frac{d\phi}{\Phi} . $$


Thus we are led to a hydrodynamical interpretation of the totality of orbits 
under consideration. Equations (19) are evidently the equations of a fluid 
in steady motion. If x, y, $\phi$ be thought of as the rectangular coordinates 
of a point, this fluid is incompressible in virtue of the identity 

$$ X_x + Y_y + \Phi_\phi = 0 . $$

That is, the triple integral $\iiint dx\,dy\,d\phi$ is invariable when taken over any 
given part of the fluid. 
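
The incompressibility identity can be verified numerically for the right-hand members of (19). The sketch below is illustrative only: `gamma` and `lam` are hypothetical test functions (any smooth positive $\gamma$ and any smooth $\lambda$ should serve), and the divergence is estimated by central differences rather than computed symbolically.

```python
import math

# Hypothetical test functions, not taken from the paper.
def gamma(x, y):
    return 2.0 + 0.5 * math.sin(x) * math.cos(y)

def gamma_x(x, y):
    return 0.5 * math.cos(x) * math.cos(y)

def gamma_y(x, y):
    return -0.5 * math.sin(x) * math.sin(y)

def lam(x, y):
    return 0.3 * x - 0.7 * y * y

def field(x, y, phi):
    """Right-hand members X, Y, Phi of equations (19)."""
    root = math.sqrt(2.0 * gamma(x, y))
    big_x = root * math.cos(phi)
    big_y = root * math.sin(phi)
    big_phi = lam(x, y) + (-gamma_x(x, y) * math.sin(phi)
                           + gamma_y(x, y) * math.cos(phi)) / root
    return big_x, big_y, big_phi

def divergence(x, y, phi, h=1e-4):
    """Central-difference estimate of X_x + Y_y + Phi_phi."""
    d_x = (field(x + h, y, phi)[0] - field(x - h, y, phi)[0]) / (2.0 * h)
    d_y = (field(x, y + h, phi)[1] - field(x, y - h, phi)[1]) / (2.0 * h)
    d_phi = (field(x, y, phi + h)[2] - field(x, y, phi - h)[2]) / (2.0 * h)
    return d_x + d_y + d_phi
```

The term $\lambda$ drops out because it does not depend on $\phi$, so the estimate should vanish up to discretization error whatever $\lambda$ is chosen.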

This first equivalent form of the equations depends upon the variable t. 
A second such form, which gives in a single equation the characteristic 
geometrical property of the orbits taken in the xy-plane, is obtained by 
considering the curvature $K = d\phi/ds$ in that plane. This intrinsic equation 

(20) $$ K = \frac{\lambda\sqrt{2\gamma} - \gamma_x\sin\phi + \gamma_y\cos\phi}{2\gamma} $$

results at once from the equations (19). Conversely, we may pass back 
from (20) to (19) by introducing the variable 

$$ t = \int \frac{dx}{\sqrt{2\gamma}\,\cos\phi} = \int \frac{dy}{\sqrt{2\gamma}\,\sin\phi} . $$

Both (19) and (20) play an important part later.* 

7. A dynamical interpretation. We propose now to obtain a dynamical 
interpretation of simple character for the equations of motion. As was 
observed in the introduction, such an interpretation by means of geodesics 
is known in the reversible case.† In fact, in this case the integral J* is 
precisely the arc length on a certain surface, so that the variation of J* is zero 
along the geodesics. 

* Compare with §§ 1, 2 of my paper in the Rendiconti above cited. 

† See Whittaker, Analytical Dynamics, pp. 249-250. 

In this interpretation the integral J* has been made use of instead of the 
integral J. Thus the totality of orbits which are simultaneously interpreted 
as geodesics is the totality given by (1), (4), and not the totality of solutions 
of the dynamical problem afforded by (1). However, the following result holds: 

In the case of a reversible dynamical problem (1) the variables x, y may be 
regarded as the coordinates of a mass particle which is constrained to move on 
the characteristic surface in a field of force of potential $-\gamma$. 

For let u, v, w denote the rectangular coordinates of the particle. The 
components of normal force which operate to hold the particle in the surface 
are pl, pm, pn, in the directions of the respective axes, where l, m, n are the 
direction cosines of the normal, and where p is a suitable multiplier. The 
components of force due to the field of force are $\gamma_u$, $\gamma_v$, $\gamma_w$. Hence the 
equations of motion in this problem are 

$$ u'' - pl - \gamma_u = 0, \qquad v'' - pm - \gamma_v = 0, \qquad w'' - pn - \gamma_w = 0 . $$

The multiplier p is determined by the fact that the particle is to lie in the 
surface. 

Now let $\delta u$, $\delta v$, $\delta w$ be functions of t which are arbitrary save that at every 
instant they are proportional to a possible displacement of the particle in 
the surface; the particle is assumed to be moving along an orbit of course. 
This imposes precisely the condition 

$$ l\,\delta u + m\,\delta v + n\,\delta w = 0 . $$


Multiply the three above equations of motion by $\delta u$, $\delta v$, $\delta w$, add, and 
integrate. We find 

$$ \int \bigl[(u'' - \gamma_u)\,\delta u + (v'' - \gamma_v)\,\delta v + (w'' - \gamma_w)\,\delta w\bigr]\,dt = 0 . $$


But the integral on the left is precisely the variation of the integral 

$$ -\int \bigl[\tfrac12(u'^2 + v'^2 + w'^2) + \gamma\bigr]\,dt, $$

which is the same as $-J$, since $du^2 + dv^2 + dw^2$ is the square of the element 
of arc on the characteristic surface and since $\alpha$ and $\beta$ are zero. Also any 
variation of u, v, w in the surface is admitted. Thus the orbits are given 
by the condition $\delta J = 0$, which proves our statement. 
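
The identification of the integral with a variation rests on one integration by parts. A short sketch of that step is given below, under the usual assumption (not stated explicitly in the text) that the variations vanish at the end-points $t_0$, $t_1$ of the interval of integration.

```latex
\begin{align*}
\delta\Bigl(-\int_{t_0}^{t_1}\bigl[\tfrac12(u'^2+v'^2+w'^2)+\gamma\bigr]\,dt\Bigr)
 &= -\int_{t_0}^{t_1}\bigl[u'\,\delta u' + v'\,\delta v' + w'\,\delta w'
      + \gamma_u\,\delta u + \gamma_v\,\delta v + \gamma_w\,\delta w\bigr]\,dt\\
 &= \int_{t_0}^{t_1}\bigl[(u''-\gamma_u)\,\delta u + (v''-\gamma_v)\,\delta v
      + (w''-\gamma_w)\,\delta w\bigr]\,dt .
\end{align*}
```

The multiplier terms do not appear because they are removed beforehand by the condition that the displacement lies in the surface.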

We have included this obvious discussion by standard Lagrangian methods 
because it facilitates the derivation of our result for the irreversible case: 

In the case of an irreversible dynamical problem (1), (4), the variables x, y 
may be regarded as the coordinates of a mass particle moving on a surface, while 
that surface rotates at a uniform rate about a fixed axis and carries with it a 
fixed conservative field of force. 

We begin by deriving the equations of motion for a problem of this type. 




Let $\xi$, $\eta$, $\zeta$ denote the rectangular coordinates of the particle on the surface 
when referred to axes fixed in space, with the $\zeta$-axis chosen as the axis of 
rotation. Also, let u, v, w denote the coordinates of the same particle referred 
to axes fixed in the body and coincident with the $\xi$, $\eta$, $\zeta$ axes at t = 0. If 
the angular velocity of rotation is taken to be unity, we have the following 
obvious relations: 

$$ \xi = u\cos t - v\sin t, \qquad \eta = u\sin t + v\cos t, \qquad \zeta = w . $$

The components of the force due to the conservative field at t = 0 in the 
direction of the $\xi$, $\eta$, $\zeta$ axes are $S_u$, $S_v$, $S_w$ where S(u, v, w) is the negative 
of the potential of the field of force moving with the surface. 

The components of normal force which operate to hold the particle in the 
surface are pl, pm, pn respectively where l, m, n are the direction cosines of 
the normal, and where p is a suitable multiplier. 

Finally, if we differentiate the above equations twice as to t and put t = 0, 
we find the $\xi$, $\eta$, $\zeta$ accelerations of the particle at t = 0 to be respectively 

$$ u'' - 2v' - u, \qquad v'' + 2u' - v, \qquad w'' . $$

Thus we have at t = 0, and similarly at any other time, 

$$ \begin{aligned}
u'' - 2v' - u - pl - S_u &= 0, \\
v'' + 2u' - v - pm - S_v &= 0, \\
w'' - pn - S_w &= 0 .
\end{aligned} $$

In these equations, only the relative coordinates of the particle appear. The 
multiplier p is determined by the fact that the particle is to lie in the surface. 
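
The accelerations quoted above follow from differentiating the rotation relations twice; the computation is sketched here for reference, with unit angular velocity as assumed in the text.

```latex
\begin{align*}
\xi   &= u\cos t - v\sin t, &
\xi'' &= (u'' - 2v' - u)\cos t - (v'' + 2u' - v)\sin t,\\
\eta   &= u\sin t + v\cos t, &
\eta'' &= (u'' - 2v' - u)\sin t + (v'' + 2u' - v)\cos t,\\
\zeta  &= w, &
\zeta'' &= w'' .
\end{align*}
% At t = 0 these reduce to u'' - 2v' - u,  v'' + 2u' - v,  w''.
```

The terms $-2v'$, $+2u'$ are the Coriolis terms and the terms $-u$, $-v$ the centrifugal terms of the rotating frame.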

Now let $\delta u$, $\delta v$, $\delta w$ be defined precisely as in the reversible case. Multiply 
the three equations of motion by $\delta u$, $\delta v$, $\delta w$ respectively, add, and integrate. 
We find 

$$ \int \bigl[(u'' - 2v' - u - S_u)\,\delta u + (v'' + 2u' - v - S_v)\,\delta v + (w'' - S_w)\,\delta w\bigr]\,dt = 0 . $$

The integral on the left is the negative of the variation of the integral 

$$ F = \int \bigl[\tfrac12(u'^2 + v'^2 + w'^2) + (vu' - uv') + S_1\bigr]\,dt, $$

where $S_1 = S + \tfrac12(u^2 + v^2)$. As before any variation of u, v, w in the surface 
is admitted. 

By expressing u, v, w explicitly in terms of variables x and y taken as 
coordinates of the particle, it becomes clear that the integral F is of the same 
form as J, so that a dynamical problem of this type is of the kind we have 
been considering.* What we wish to prove is that for a given problem 
furnished by an assigned set of equations (1), (4) there will be a corresponding 
value of F which leads to the same totality of orbits. 

Let us assume that we employ isothermal variables on the characteristic 
surface of this given problem (1). The integral J* may then be written 

$$ \int \bigl[\sqrt{2\gamma\mu\,(x'^2 + y'^2)} + \alpha x' + \beta y'\bigr]\,dt, $$

which differs from 

$$ \int \bigl[\sqrt{2p\gamma \cdot \mu\,(x'^2 + y'^2)/p} + (\alpha + \theta_x)\,x' + (\beta + \theta_y)\,y'\bigr]\,dt $$

only in the perfect differential under the integral sign. The orbits will 
not be altered by the introduction of a complete differential under the integral 
sign in J* since the variation is not thereby affected. Thus we conclude that, 
if the integrand L of J be given the form 

$$ \frac{\mu}{2p}\,(x'^2 + y'^2) + (\alpha + \theta_x)\,x' + (\beta + \theta_y)\,y' + p\gamma, $$

which involves the two arbitrary functions p and $\theta$, the orbits are unaltered. 
The significance of the time has been changed. 

Hence if we can choose p, $\theta$, u, v, w, $S_1$, so that the identity 

$$ \tfrac12(u'^2 + v'^2 + w'^2) + (vu' - uv') + S_1 \equiv \frac{\mu}{2p}\,(x'^2 + y'^2) + (\alpha + \theta_x)\,x' + (\beta + \theta_y)\,y' + p\gamma $$

holds, the integrand of F will become the same as that of the modified J, and 
the italicized statement will be established; the value of S is $S_1 - \tfrac12(u^2 + v^2)$. 
Of course we write 

$$ u' = u_x x' + u_y y', \qquad v' = v_x x' + v_y y', \qquad w' = w_x x' + w_y y', $$

so that the conditions to be satisfied are six in number, 

$$ u_x^2 + v_x^2 + w_x^2 = u_y^2 + v_y^2 + w_y^2 = \frac{\mu}{p}, \qquad u_x u_y + v_x v_y + w_x w_y = 0, $$

$$ v u_x - u v_x = \alpha + \theta_x, \qquad v u_y - u v_y = \beta + \theta_y, \qquad S_1 = p\gamma, $$

and involve six unknowns p, $\theta$, u, v, w, $S_1$. 

* See Whittaker, Analytical Dynamics, pp. 39-41. 

If the first two members of the continued equality of the first line are equal, 
their common value yields $\mu/p$ and so p. The last equation of the second 
line then determines $S_1$. Furthermore the first two equations of the second 
line may be regarded as a pair of simultaneous differential equations for $\theta$, 
from which a value of $\theta$ unique up to an additive constant can be obtained if 
the two equations are compatible. We are thus led to the three simultaneous 
partial differential equations for u, v, w: 

$$ u_x^2 + v_x^2 + w_x^2 = u_y^2 + v_y^2 + w_y^2, \qquad u_x u_y + v_x v_y + w_x w_y = 0, $$

$$ 2\,(u_x v_y - v_x u_y) = \alpha_y - \beta_x = \lambda . $$

If these equations admit of solution, the six earlier equations can be satisfied 
by a proper choice of p, $\theta$, and $S_1$. 

Since it is always possible to choose a real set of values of $u_x$, $v_x$, $w_x$, $u_y$, 
$v_y$, $w_y$ (for any given value of $\lambda$) which satisfies these equations, and is such 
that the jacobian of the left-hand members in $u_x$, $v_x$, $w_x$ is not zero, it follows 
that a real analytic solution exists. 

It is worthy of note that the above interpretation involves only two 
arbitrary functions of x and y, namely the functions which define the surface 
and the potential of the forces on that surface. 

A second interpretation of less interest from a dynamical point of view is 
immediately suggested by the normal form (1'). These equations are 
evidently those of a mass particle, electrically charged, which moves in a plane 
subject to an electric field derived from a potential proportional to $\gamma$, and 
subject to a normal magnetic field of strength proportional to $\lambda$. 
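
The correspondence can be made explicit with the Lorentz force law. In the sketch below the unit mass, the charge q, the planar potential V(x, y) and the normal field strength B(x, y) are notation introduced here for illustration and do not appear in the original.

```latex
% Assumed setting: unit mass, charge q, electric field -\nabla V in the plane,
% magnetic field of strength B normal to the plane.
\begin{align*}
x'' &= q\,(-V_x + B\,y'), & y'' &= q\,(-V_y - B\,x'),\\
\intertext{which is the normal form}
x'' + \lambda\,y' &= \gamma_x, & y'' - \lambda\,x' &= \gamma_y
\end{align*}
% on setting \gamma = -qV and \lambda = -qB.
```

With these identifications $\gamma$ is proportional to the electric potential and $\lambda$ to the magnetic field strength, as the text asserts.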

Part II. Direct criteria for periodic orbits 

8. Concave boundaries in the reversible case. Consider any boundary 
of a continuum C on the characteristic surface in a reversible problem. Let 
P and Q be any pair of points of that boundary which form the extremities 
of some interior rectifiable arc PQ of length less than d (d small). If then the 
region limited by such an arc PQ and the unique short orbital arc with the 
same two end-points never contains boundary points within it, the boundary 
will be termed concave.* 

If a boundary is made up of a finite number of arcs with continuous 
curvature, forming a simply closed curve, the condition for concavity is that the 
interior angles† at the vertices are less than $\pi$, and that at every other point 
the curvature towards the interior is not less than that of the tangent orbit. 
Thus, if the orbits are straight lines in the plane, the boundary of any convex 
curvilinear polygon is concave with respect to the interior continuum. 

* Signorini uses the designation boundary of Whittaker (contorno di Whittaker) for 
boundaries of a slightly more restricted type (loc. cit.). 

† That is, interior to C. 



We shall not pause to give a proof of this statement which does not enter 
into our later reasoning. In a very important special case, however, we 
note that the statement holds, namely in the case when the arcs are orbital 
arcs; for then the orbital arc joining any two nearby points of a boundary 
arc coincides with it, while the orbital arc joining two nearby points on 
opposite sides of a vertex is an interior arc. 

A second very important case of a new type is afforded by a boundary 
which is formed by any collection of complete orbits, and their limit points. 

Let us demonstrate that a boundary of this sort is concave. Suppose that 
P and Q are points on it forming the extremities of a short rectifiable arc lying 
within the continuum. If the region limited by this inner arc and the unique 
short orbital arc joining P to Q contained a boundary point R of C within it, 
orbits of the set used to define the boundary could be found which came as 
near to R as desired. Nearby orbits to R will necessarily cut the boundary 
of the small region formed by the two short arcs at least twice. By definition 
such an orbit cannot cut the inner arc at all. But two short orbital arcs can 
intersect at most once. Thus we arrive at a contradiction if we assume that the 
small continuum includes a boundary point within it. Consequently the boundary 
formed by the set of complete orbits is concave. 

Any concave boundary $\Gamma$ of a continuum C can be approached by another 
concave boundary in C made up of a set of orbital arcs with interior angles not 
greater than $\pi$. 

To establish this fundamental property of concave boundaries we imagine 
the given neighborhood mapped analytically upon a plane. This is merely 
done for convenience of statement, as will be seen. 

Let us now construct a network of squares which contains all of the points 
of the boundary $\Gamma$ within it, and let us take the sides very small. Further, 
let us reject all of these squares save those which contain both an inner point 
of the continuum and a point of its boundary. The non-rejected squares 
will lie in the small neighborhood of $\Gamma$. There exists an inner point $P_0$ of C 
not within this small neighborhood. 

Now consider the open continuum C' obtained by adding all inner points 
which may be reached from $P_0$ along a rectifiable path of which no point is 
on a side of these non-rejected squares. 

No boundary points of this new continuum C' are boundary points of C 
also. For if such a common boundary point were within a square of the 
network, the square would be a non-rejected square, and thus the point could 
not be approached from $P_0$. And if the boundary point lies on a side of a 
square but not at an end-point, neither square which abuts on this side is 
one of the rejected ones unless it contains no inner points of C. Hence if 
both contain an inner point we are led to the same difficulty as before. Whereas 
if only one of the two squares contains an inner point, we must approach 
the common boundary point from $P_0$ through that non-rejected square, which 
is not possible. Finally if the common boundary point is at a vertex there 
are four abutting squares. Again we must approach the point through the 
non-rejected squares of these four which also contain inner points of C, and 
thus the same contradiction arises. 

Since there is no common boundary point of C and C' we infer that the 
boundary $\Gamma'$ of C' is an inner broken line without double points, made up of 
the sides of the non-rejected squares. Moreover $\Gamma'$ lies near to $\Gamma$ and encloses 
between itself and $\Gamma$ all of the non-rejected squares. 

We now reject further those squares which have no side in common with $\Gamma'$. 
The final set of non-rejected squares will have sides which appear in a certain 
circular order on $\Gamma'$. Let $K_1, K_2, \ldots, K_m$ be the vertices on $\Gamma'$, in circular 
order, at which two of these squares meet. Each arc $K_1 K_2$, $K_2 K_3$, $\ldots$, 
$K_m K_1$ of $\Gamma'$ is made up of at most three sides of a square. There must be such 
points K since there is more than one square. 

A vertex $K_i$ of this kind is an end-point of at least one side of a non-rejected 
square not forming part of $\Gamma'$. Beginning with this side let a point traverse 
the edges of the non-rejected square until a boundary point $L_i$ of $\Gamma$ is reached. 
There is at least one such point on a side of the square since there is at least 
one boundary point in the square, and since $\Gamma$ does not lie wholly within 
the square. Obviously such a boundary point will be reached before the 
point returns to a point of $\Gamma'$ again. Thus with each vertex $K_i$ a point $L_i$ 
may be associated. The broken line from $L_i$ to $K_i$ is made up of less than 
four sides of a non-rejected square and lies between $\Gamma$ and $\Gamma'$. 



Any arc of the type $L_1 K_1 K_2 L_2$ forms an arc interior to C save for its 
two end-points which are boundary points of $\Gamma$. Moreover its length does 
not exceed eleven times a side of a square. 

Two successive arcs $L_i K_i$, such as $L_1 K_1$ and $L_2 K_2$, can have no point in 
common save possibly their end-points L. For, otherwise, part of $L_1 K_1 K_2 L_2$ 
having no point of $\Gamma$ on its boundary would enclose one or more non-rejected 
squares, and would divide $\Gamma$ into two parts without a common limit point, 
which is not possible. We conclude that the points $L_1, L_2, \ldots, L_m$ appear 
on $\Gamma$ in the same angular order as the points $K_1, K_2, \ldots, K_m$ on $\Gamma'$. 

[Fig. 1] 






Since $\Gamma$ is concave, the unique short orbital arc $L_1 L_2$ taken together with 
the broken line $L_1 K_1 K_2 L_2$ encloses a region which contains no points of the 
boundary of C within it (Fig. 1). Consequently any arc such as $L_1 L_2$ lies in 
C and it is not possible to pass from $P_0$ to the boundary of C without 
touching one of these orbital arcs. 

Evidently this set of arcs, or segments of them, will form a boundary $\Gamma''$ 
made up of orbital arcs lying near the boundary $\Gamma$. This boundary $\Gamma''$ will 
be a concave boundary of the desired type if the interior angles of $\Gamma''$ do not 
exceed $\pi$. 

At a vertex of $\Gamma''$ interior to $\Gamma$ the interior angle is seen at once to be less 
than $\pi$ since it is formed by orbital arcs which do not terminate at the vertex. 

It remains only to prove that the interior angles at the vertices $L_i$ are not 
greater than $\pi$; if an interior angle is equal to $\pi$, the two abutting arcs may 
be united. 

But if this angle did exceed $\pi$ we should be led to a contradiction. Let 
AB be an orbital arc joining points A and B on opposite sides of such a 
hypothetical vertex. Then AB lies in the exterior angle. If AB contains no 
boundary point on it, a second short arc AB might be drawn in the interior 
angle at the vertex, and the two arcs AB would have no boundary point 
on them although enclosing a boundary point at the vertex; this is not 
possible. 

On the other hand if AB does contain a boundary point, there will be a 
first such point A' from the side of A and similarly a first point B' from the 
side of B. But in this event the orbital arc A'B' (which coincides with part of 
AB), and the curve A'ABB' are both short arcs, if AB be properly taken in 
the interior angle at the vertex. Moreover the latter arc is interior to C save 
at A' and B'. Thus the boundary would not be concave since these two 
arcs include the vertex. 

9. The minimum method in the reversible case. We will say that an orbit 
is of minimum type if J = J* is not less along any nearby closed curve than 
along the orbit; if J is not less along any nearby closed curve on one side of 
the orbit, the orbit will be termed of unilateral minimum type. Our first 
result concerning these orbits is restricted to the reversible case, when J is 
positive, and may be stated as follows: 

Given a continuum C on the characteristic surface of a reversible problem with 
$\gamma > 0$ in C, and given a rectifiable closed curve A along which $J < J_0$, and 
which is not continuously deformable to a point on C under the restriction $J < J_0$. 
Then, if every boundary of C is either concave (see § 8) or is such that A cannot 
be continuously deformed to approach anywhere a point of that boundary under 
the restriction $J < J_0$, there will exist a periodic orbit on C into which A can be 
continuously deformed under the restriction $J < J_0$, which is either of minimum 
type and wholly within C, or of unilateral minimum type and coincident with 
one of the boundaries.* 

Let us commence with the very simple and important case when there are no 
boundaries. 

Take n large and divide A into n arcs along each of which J has a small 
value $J_0/n$. Each of these arcs may be continuously varied into the short 
orbital arc with the same extremities. The possibility of doing this depends 
essentially on the fact that A lies within C. If we allow a point P to move 
from one extremity of the arc to the other, the orbital arc from the initial 
point to P combined with the curve from P to the final point furnishes a 
curve which deforms continuously from the given arc to the orbital arc with 
the same end-points, while J never increases.† By treating the n arcs in 
this way we deform the curve into a curve A' formed by n orbital arcs, also 
within C, under the restriction $J < J_0$. 

Consider now any set of n points $P_1, P_2, \ldots, P_n$ arranged in a given 
order, and with successive points ($P_n$ and $P_1$ being counted as successive) so 
near that J along the short orbital arc which joins successive points is not 
greater than $J_0/n$. The total value of J along the combined arcs will be 
indicated by $J(P_1, P_2, \ldots, P_n)$ and will not exceed $J_0$. Thus J may be 
looked at as a positive continuous function of position in the analytic 
2n-dimensional manifold $C_{2n}$ determined by n points of the characteristic surface 
and defined over the part $D_{2n}$ in which J along each arc is not greater than 
$J_0/n$. This part $D_{2n}$ is bounded by analytic manifolds, corresponding to 
coincidence of two successive points or to the fact that J along the orbital 
arc joining two such successive points equals $J_0/n$. 

The initial curve A' evidently furnishes a “point” $(P_1, P_2, \ldots, P_n)$ 
of $D_{2n}$. We restrict attention to that continuum of $D_{2n}$ which contains 
this point. By continuous variation of a point in $D_{2n}$ it is evidently possible 
to arrive at the point of $D_{2n}$ at which J has an absolute minimum in that 
continuum. 

Now pass to the corresponding minimizing curve $P_1 P_2 \cdots P_n$ which 
can be deduced from A' by continuous variation with $J < J_0$. 

The angles at the vertices of this curve are all $\pi$. For, on a properly 
taken characteristic surface, J denotes arc length, and the orbits become the 
geodesics. If the angle at the vertex $P_1$ is not $\pi$, with $P_n$ as center let us 
strike off a geodesic circle through $P_1$, which will cut $P_n P_1$ orthogonally on 
the characteristic surface. With $P_2$ as a center let us strike off a second such 
arc through $P_1$. These two circles will necessarily have a region in common 
precisely because the angle at the vertex is not $\pi$. If $P_1'$ be a point within 
this common region the short geodesics $P_n P_1'$ and $P_1' P_2$ will be shorter 
respectively than $P_n P_1$ and $P_1 P_2$. If we allow $P_1$ to vary continuously from 
$P_1$ to $P_1'$ within this region, J will further decrease while the point 
$(P_1, P_2, \ldots, P_n)$ remains within $D_{2n}$. This is impossible. 

* Compare with Signorini (loc. cit.) whose method of attack is essentially the one here 
employed. 

† We assume here and later the minimizing property of short extremal arcs. 

This argument is unaffected if several vertices are coincident at $P_1$, in 
which case we treat them as constituting a single vertex. 

It follows that the minimizing curve $P_1 P_2 \cdots P_n$ is a periodic orbit. It 
remains to prove that this orbit is of minimum type. 

To see this we note that by taking n points $P_1, P_2, \ldots, P_n$ along the 
orbit so as to divide it into n arcs of equal length less than $J_0/n$, we can get 
an interior point of $D_{2n}$ which corresponds to the minimizing curve. If there 
exists a curve near the orbit (i. e., such that corresponding points of curve and 
orbit are uniformly near together) along which J is less than along the orbit, 
we could employ our first process of deformation to find a nearby curve 
$P_1 P_2 \cdots P_n$ of orbital arcs for which J is less than along the orbit. But 
this state of affairs is impossible, for otherwise J would be less at a point of 
$D_{2n}$ than at the nearby point for which J is a minimum. 

This completes our demonstration for the case of no boundaries. 
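
The idea of lowering J by moving the vertices of a closed chain of short geodesic arcs can be illustrated on a very simple surface. The sketch below is an assumption-laden toy, not the existence argument of the text: it takes a flat torus (so that geodesics are straight segments in the covering plane and J is ordinary length), a closed curve winding once around, and a hypothetical midpoint rule that replaces each vertex in turn by the midpoint of the geodesic joining its neighbors. The names `closed_length` and `shorten` are introduced here for illustration.

```python
import math

def closed_length(points, period):
    """Length of the closed polygon on a flat torus, measured in the covering
    plane; the last vertex is joined to the first one shifted by `period`."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[i]
        if i + 1 < n:
            x1, y1 = points[i + 1]
        else:
            x1, y1 = points[0][0] + period[0], points[0][1] + period[1]
        total += math.hypot(x1 - x0, y1 - y0)
    return total

def shorten(points, period, sweeps):
    """Repeatedly replace each vertex by the midpoint of the straight arc
    joining its two neighbors; the total length never increases."""
    pts = [tuple(p) for p in points]
    n = len(pts)
    lengths = [closed_length(pts, period)]
    for _ in range(sweeps):
        for i in range(n):
            # Neighbors taken in the covering plane, with the period shift
            # applied where the closed curve wraps around.
            if i > 0:
                px, py = pts[i - 1]
            else:
                px, py = pts[n - 1][0] - period[0], pts[n - 1][1] - period[1]
            if i < n - 1:
                qx, qy = pts[i + 1]
            else:
                qx, qy = pts[0][0] + period[0], pts[0][1] + period[1]
            pts[i] = (0.5 * (px + qx), 0.5 * (py + qy))
        lengths.append(closed_length(pts, period))
    return pts, lengths
```

For instance, starting from a wavy curve such as `[(i / 12, 0.2 * math.sin(2 * math.pi * i / 12)) for i in range(12)]` with `period = (1.0, 0.0)`, the recorded lengths decrease toward the length of the straight closed geodesic in the same class. No vertex retains an angle different from $\pi$ in the limit, which mirrors the argument given above for the minimizing curve.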

Suppose now that only concave boundaries are present and that these are 
of the simple type singled out at first, being formed by orbital arcs with 
interior angles less than $\pi$. 

Inasmuch as the curve A lies within C we may continuously deform A to a 
curve $P_1 P_2 \cdots P_n$ made up of orbital arcs precisely as before. We are led 
again to define the positive continuous function $J(P_1, P_2, \ldots, P_n)$ where 
$P_1, P_2, \ldots, P_n$ now lie on C and may not pass its boundaries. The region 
$D_{2n}$ will be a closed continuum in 2n-dimensional space. 

It is important to note that no part of the curve $P_1 P_2 \cdots P_n$ can vary out 
of C as long as the points $P_1, P_2, \ldots, P_n$ lie in C. This is obviously true for 
the particular type of boundary under consideration. 



Thus we are led to a minimizing curve $P_1 P_2 \cdots P_n$ as before, such that J 
has a minimum at the corresponding point of $D_{2n}$. The only possible new 
complication is that this curve has a vertex $P_i$ on one of the concave boundaries 
$\Gamma$. Let us investigate this possibility. 

The orbital arcs of the minimizing curve on either side of any such vertex $P_i$ 
will lie wholly on the inner side of the nearby parts of $\Gamma$. Also the part of 
the angle on the opposite side from $\Gamma$ cannot exceed $\pi$ (Fig. 2). If, however, 
this angle were less than $\pi$ a further deformation would be possible which 
reduced J further, just as before. We conclude that there are no actual 
vertices on the boundary, i. e., that the angles are all equal to $\pi$ at these 
points. 

From this it follows that if there is a single vertex on a concave boundary $\Gamma$ 
the adjoining sides must coincide with that boundary in a single orbital arc. 
By passing to adjacent vertices of $P_1 \cdots P_n$ it is inferred that $\Gamma$ forms a 
single periodic orbit with which the minimizing curve coincides. 

Now J is not less along the nearby curves within C than along this periodic 
orbit. Such a possibility would lead at once to the conclusion that the orbit 
was not the minimizing curve, just as in the earlier case. 

Hence we see that either the minimizing curve furnishes a periodic orbit 
of minimum type interior to C, or a periodic orbit of unilateral minimum 
type and coincident with a concave boundary. 

If some or all of the concave boundaries are not made up of a set of orbital 
arcs we may approach to them, nevertheless, by concave boundaries of this 
special type (see §8). In this way a new continuum within C, possessing 
only concave boundaries, is built up. By means of it we can argue at once 
the existence of an orbit which satisfies the conditions of the theorem; for if 
the modified boundaries are sufficiently near the boundaries which they 
enclose, the curve A lies wholly within the new continuum, and the earlier 
argument may be applied to this continuum. 

Finally, we observe that, since A cannot be deformed with $J < J_0$ to approach 
one of the type of non-concave boundaries allowed to enter, these boundaries 
do not introduce any difficulties. 

This second kind of boundary was present in the surfaces of negative 
curvature considered by Hadamard (loc. cit.). 

It would be desirable to establish, if possible, the existence of an orbit of 
minimum type in every case, whereas in a special case we have only inferred 
the existence of an orbit of unilateral minimum type. The following example 
shows that it is not possible to go further. 

Consider the geodesics on a surface of revolution generated by revolving a 
curve y = f(x) about the x-axis. We will consider the section of the surface 
generated by the part of the curve between x = a and x = b (a < b). Then, 
if the slope of the curve is zero at both x = a and x = b but negative 
elsewhere, this part of the surface is a continuum C in the form of a ring whose 
two boundaries are themselves closed geodesics and so yield two concave 
boundaries. Any circle in a plane perpendicular to the axis may be taken 
as a curve A. 

The closed geodesic of minimum length around the ring is evidently the 
circle generated by the rotation of the ordinate at x = b. But if the generating 
curve has an ordinate which diminishes further for x > b, the circle is of 
unilateral minimum type and not of minimum type. 
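
A concrete profile makes the example easy to check. The sketch below assumes a hypothetical generating curve with slope $f'(x) = -x(x-1)^2$, so that a = 0 and b = 1; neither this curve nor the function names come from the original. It compares the lengths $2\pi f(x)$ of the circles generated by the ordinates on either side of x = b.

```python
import math

A, B = 0.0, 1.0  # hypothetical end-points a < b of the ring

def slope(x):
    """Slope f'(x): zero at x = A and x = B, negative elsewhere for x > A."""
    return -x * (x - 1.0) ** 2

def profile(x):
    """Generating curve f(x), an antiderivative of slope(x) with f(0) = 2."""
    return 2.0 - (x ** 2 / 2.0 - 2.0 * x ** 3 / 3.0 + x ** 4 / 4.0)

def parallel_length(x):
    """Length of the circle generated by rotating the ordinate at x."""
    return 2.0 * math.pi * profile(x)
```

Within the ring the circle at x = b is shorter than any other parallel, while the parallels just beyond x = b are shorter still; this is the sense in which the minimum is only unilateral.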

10. Concave boundaries in the irreversible case. Consider now any con¬ 
tinuum C taken in the ary-plane instead of on the characteristic surface. 
Le t P and Q be points of its boundary which are connected by an interior 
rectifiable arc PQ. Clearly if P and Q are end-points of interior arcs AP 
and BQ and if the two inner end-points A and B of these arcs are joined by 
some inner rectifiable arc AB, the three arcs connect P to Q. 

Arcs PQ of this kind lying near the boundary clearly fall into classes ac¬ 
cording to the number of times that PQ winds around the boundary. In 
particular, if a definite sense has been assigned to the boundary, there are 
arcs PQ which go from P to Q in the same sense and do not wind around the 
boundary at all. In this event we will say that P is positively connected with 
Q by the arc PQ . 

If, whenever a point P is positively connected with a point Q by an arc 
PQ of length less than d (d small), the region limited by this arc and the 
unique short orbital arc from P to Q contains no boundary points within it, 
the given sensed boundary wilj be called concave. 

A boundary made up of a finite number of arcs with continuous curvature 
and taken in a definite sense will be concave if the interior angles at the vertices 
are less than π, and if the curvature towards the interior is not less than that
of the orbit tangent positively to the sensed boundary. But the only particular
case to enter explicitly into our later reasoning is that in which these
arcs are orbital. In this case the conditions for concavity are obviously 
satisfied. 


With this definition we can infer: 


Any sensed concave boundary T of a continuum C can be approached by another
concave boundary in C made up of orbital arcs taken in the same sense with
interior angles not greater than π.

The proof can be made by means of a slight modification of the corresponding 
proof in the reversible case (§ 8). 

Precisely as before we construct the broken line T' made up of sides of 
non-rejected squares near the boundary and again denote the vertices of T' 
at which two squares meet by K_1, K_2, ..., K_m. But we now choose these
vertices in the order which gives T' the same sense as T. 

Let now the non-intersecting broken lines K_i L_i be constructed as before.
It is apparent that the point L_1 is positively connected with L_2 by the broken
line arc L_1 K_1 K_2 L_2, and a similar statement holds for the other m − 1 arcs
of the same type.

Hence, if we construct the short orbital arcs L_1 L_2, L_2 L_3, ..., L_m L_1 as
before, we infer that these arcs lie in C. We recall that, by the defining
property of a concave boundary, the region limited by the arc L_1 K_1 K_2 L_2
and the orbital arc L_1 L_2, for instance, lies in C.

Evidently these orbital arcs completely separate T from an interior point 
P 0 of C not near to T. The parts of these arcs accessible from P 0 form a 
curve T" of orbital arcs lying near C. Further, these accessible arcs have 
the same sense as T, for we cannot pass from P_0 to a point of an arc such as
L_1 L_2 on the side toward the boundary T.

Thus the curve T" formed by these accessible orbital arcs yields a concave 
boundary of the stated kind. 

In the reversible case a fundamental property of concave boundaries made 
up of orbital arcs is that a short enough orbital arc with end-points in C lies 
wholly in C. 

An analogous property holds in the irreversible case, but requires more 
careful statement. We will take only the simple case which actually enters 
into the later reasoning. 

Suppose that T is a concave boundary made up of orbital arcs with interior 
angles less than π at the vertices, and that C is a ring in the xy-plane.

If a curve T* formed by short orbital arcs makes a circuit of C in the same 
sense as T without crossing itself, and if the vertices of T* lie within C , then 
T* lies wholly in C. 

The proof of this statement is immediate. The part of T* accessible from 
a point P_1 outside of C but not near to it must evidently consist of parts of
orbital arcs of T* or of the whole of such arcs. Moreover the sense of these 
arcs will appear to be the same as that of T, since T* does not cross itself. 
Bearing in mind the special character of T we see, however, that such an arc 
cannot lie outside of C . 

11. The ring criterion in the irreversible case. We shall prove the following
result:

Given a ring in the xy-plane throughout which λ and γ (see (1'), (4')) are
positive, and whose boundaries are concave in one and the same sense. Then
there will exist either a periodic orbit of unilateral minimum type coincident with
one of the two boundaries, or a periodic orbit of minimum type without double
points lying wholly within the ring and making a single circuit of the ring in the
sense of the boundaries.

The restriction that λ be positive is not essential, but it is essential that λ
be of one sign. If λ < 0 a mere interchange of the roles of the equations (1')
(i. e., of x and y) brings us back to the case λ > 0.

We shall confine attention to the case when the boundaries are taken in a 
positive sense. Entirely similar arguments apply when the sense is negative; 
in fact, if there exists a point within the inner boundary, a direct conformal
transformation of the plane of the form w = 1/(z − η) will take this inner
point to infinity, and throw the given dynamical problem into a similar one 
with the sense of the boundaries reversed. 
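
The reversal of sense under such an inversion can be checked numerically. The following is only a minimal sketch: the interior point is written z0 (a name chosen here), the test circle and its radius are arbitrary illustrative values, and the signed area of a polygon is used as a stand-in for the sense in which a closed curve is described.

```python
import cmath
import math


def signed_area(points):
    """Shoelace formula: positive when the polygon is traversed counterclockwise."""
    total = 0.0
    n = len(points)
    for i in range(n):
        p, q = points[i], points[(i + 1) % n]
        total += p.real * q.imag - q.real * p.imag
    return 0.5 * total


def invert(z, z0):
    """The conformal map w = 1/(z - z0), which sends z0 to infinity."""
    return 1.0 / (z - z0)


# Hypothetical interior point and a circle of radius 2 described positively about it.
z0 = 0.3 + 0.1j
circle = [z0 + 2.0 * cmath.exp(2j * math.pi * k / 400) for k in range(400)]
image = [invert(z, z0) for z in circle]

area_before = signed_area(circle)  # positive: counterclockwise circuit
area_after = signed_area(image)    # negative: the image circuit is clockwise
```

The image is a circle of radius 1/2 about the origin described in the opposite sense, which is the behavior the argument relies on; the sketch says nothing about the transformed dynamical problem itself.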

Moreover we assume at first that the boundaries are made up of a finite 
number of orbital arcs with the interior angles at the vertices less than π;
it has been seen that such boundaries are concave. 

(a) Existence of a minimizing curve T. Consider any analytic curve which 
makes a single circuit of the ring in a positive sense, and which has no double 
points. By joining nearby points on it by short orbital arcs taken in the 
same sense, one obtains a curve T_0 made up of n_0 orbital arcs, each of length
less than a small quantity d. The curve T_0 makes a single circuit of the
ring in a positive sense and has no double points. 

We propose to restrict attention to a class T of curves P_1 P_2 ... P_n made
up of a fixed number n > n_0 of orbital arcs P_1 P_2, P_2 P_3, ..., P_n P_1 each of
length not greater than d, namely those which make a single positive circuit 
of the ring, and are either without double points, or merely touch themselves 
internally. 

Such curves T are clearly wholly accessible from the outer boundary of the 
ring. Also it is clear that if a curve made up of a set of n orbital arcs each of 
length not greater than d forms a boundary wholly accessible from without 
and described in a positive sense it must be a curve T. Since it is always 
possible to introduce further vertices P arbitrarily, the particular curve T_0
chosen may be regarded as belonging to the class T for any n > n_0.

Let us choose any particular value of n > n_0 and let us consider the integral
J taken around a curve of type T. We will write J in the form (see § 2)

\[ J^* = S - A, \]

where we take

\[ S = \int \sqrt{2\gamma}\, ds, \qquad A = \int (\alpha\, dx + \beta\, dy). \]

The component S has a positive value independent of the direction of
integration and is analogous to arc length. The component A is analogous
to an area and by Green's theorem may be written (see § 2)

\[ \iint \lambda\, dx\, dy - k. \]

Here the double integral is taken over the area within T, and k is a numerical
constant.

If γ_0 denotes the positive minimum of γ over the ring, and if l denotes
the length of the curve T, the integral S will be at least as great as √(2γ_0) l.
Since the double integral is taken over an area within the ring, A will be less
than a constant μ_0, and thus we get for any curve T

(21)    \[ J^* > \sqrt{2\gamma_0}\, l - \mu_0. \]
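
The two estimates that combine into (21) can be set side by side. This is only a restatement of the bounds just given, assuming that S is the integral of √(2γ) with respect to arc length (as the bound on S suggests) and writing γ_0 for the minimum of γ, l for the length of the curve, and μ_0 for the constant bounding A.

```latex
\begin{align*}
S &= \int \sqrt{2\gamma}\,ds \;\ge\; \sqrt{2\gamma_0}\int ds \;=\; \sqrt{2\gamma_0}\,l, \\
A &< \mu_0, \\
J^* &= S - A \;>\; \sqrt{2\gamma_0}\,l - \mu_0 .
\end{align*}
```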




If the vertices of T, taken in succession, are P_1, P_2, ..., P_n we may denote
the value of J* along T by J(P_1, P_2, ..., P_n).

By the preceding inequality, the lower bound J̄ of J(P_1, P_2, ..., P_n)
exceeds −μ_0. By choosing a proper sequence of curves we can evidently
make P_1, P_2, ..., P_n approach simultaneously a set of limit points P̄_1, P̄_2,
..., P̄_n respectively while J approaches this lower limit J̄.

We propose to investigate the limiting curve T̄ formed by the n orbital
arcs P̄_1 P̄_2, P̄_2 P̄_3, ..., P̄_n P̄_1 and to show that if n be taken large enough
this curve will form a periodic orbit of the type desired. It is obvious that
J has the minimum value J̄ along T̄.

(b) Proof that T̄ is of type T. Let T_1 denote the part of T̄ that is accessible
from the outer boundary. This curve T_1 is made up of orbital arcs formed
from a part of an arc of T̄ or from the whole of such an arc. Inasmuch as the
orbital arcs are analytic, there are only a finite number of such arcs.

It is evident that T_1 makes a single circuit about the ring. I say further
that, if the constituent orbital arcs are reckoned in the sense of increasing
time, T_1 will make a positive circuit of the ring.

For suppose that the sense of any arc PQ of T_1 is negative. In this event
PQ will be accessible from the outer boundary along analytic curves PL and
QM ending at points L and M of that boundary which have no point in common
(Fig. 3), and such that the points L and M appear in a negative order
along the outer boundary. Now let P_1 and Q_1 be points on PL and QM
respectively near to P and Q, and let P_1 Q_1 denote an arc lying within the
region enclosed positively by PQML and uniformly near to PQ throughout
its length.

Consider now an approximating arc P' Q' of a curve T which will have a
direction nearly that of PQ at any point. The arcs P' Q' and P_1 Q_1 are entirely
distinct if P' Q' be taken sufficiently near to PQ, and form a narrow
"canal." Moreover no point of the approximating curve T will lie on the
region P_1 Q_1 M L under the same circumstances, since no point of T_1 does.




The analysis situs of the figure then renders it apparent that the approximating
curve (which makes a single positive circuit of the ring, is wholly
accessible from the outer boundary and is either without double points or
merely touches itself internally) must have a branch P'' Q'' passing through
this canal in the opposite sense from that of T_1. Therefore it is apparent
that the arc PQ also appears as a limiting orbital arc in the sense QP.

But this is impossible. For if a sensed arc and the same arc taken in the 
opposite sense are orbital arcs, the curvatures in the positive directions along 
the two arcs would be the negatives of each other. However, the curvature 
formula (20) shows that the sum of the two curvatures is not zero but is
precisely equal to 2λ/√(2γ), on account of the fact that the two values of φ
differ by π along the two curves.
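
Formula (20) is not reproduced in this part of the paper. A minimal sketch of where the sum 2λ/√(2γ) comes from is given below, writing λ and γ for the two functions entering (1'), (4') and φ for the direction angle. It assumes that the equations of motion have the form stated in the comment, which is an assumption about (1') and (4') rather than a quotation of them.

```latex
% Assumed form of (1') and (4'):
%   x'' + \lambda y' = \gamma_x, \quad y'' - \lambda x' = \gamma_y, \quad x'^2 + y'^2 = 2\gamma.
\begin{align*}
x' &= \sqrt{2\gamma}\,\cos\varphi, \qquad y' = \sqrt{2\gamma}\,\sin\varphi, \\
\kappa(\varphi) &= \frac{x'y'' - y'x''}{(2\gamma)^{3/2}}
  = \frac{\lambda}{\sqrt{2\gamma}}
  + \frac{\gamma_y\cos\varphi - \gamma_x\sin\varphi}{2\gamma}, \\
\kappa(\varphi) + \kappa(\varphi + \pi) &= \frac{2\lambda}{\sqrt{2\gamma}} .
\end{align*}
```

Under this assumption the second term of κ changes sign when φ is replaced by φ + π, so only the term in λ survives in the sum.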

We note in passing the fact that we may conclude further: If λ > 0, the
orbit having the opposite direction to that of a given orbit at a point lies to the
right of that orbit near the point (Fig. 4). Indeed we see that the difference between
the curvature of the given orbit in its positive direction and the other orbit measured
in the same direction is 2λ/√(2γ), a positive quantity.

We have now proved that the orbital arcs which make up T_1 yield a positive
circuit of the ring. Also none of these orbital arcs can exceed d in length, for
each of them is either the limit of an arc of curves T or of a part of such an arc.

I say further that T_1 does not contain more than n such arcs.

There are only n vertices P_1, P_2, ..., P_n on each of the approximating
curves. Hence if it is demonstrated that every end-point of an orbital arc
of T_1 is the limit of at least one of these vertices, the statement will be established.
But in the contrary case there will be an end-point of an arc of T_1
which may be taken at the center of a circle with radius so small that all of 
the approximating curves from and after a fixed one have no vertices within 
this circle. Such a situation implies that the approximating curves, as far 
as they lie in the circle, are composed of a single orbital arc terminated by 
the circumference. 

Hence, if the two arcs of T_1 which meet at the center form an exterior angle
different from 0, π, or 2π, nearby approximating arcs will necessarily intersect;
this is contrary to the assumption that the curves T do not intersect.




This exterior angle cannot be 2π on account of the relation between oppositely
tangent orbits noted above.

If the exterior angle at the center is π, that point can count as a vertex only
because it is the limit of one of the n vertices of the set of approximating
curves T. The statement holds in this case.

In order to eliminate the possibility that the angle at the center is 0 it is
necessary to make use of the assumption that T̄ was defined as the limit of
a set of approximating curves for which J approaches its lower bound J̄.
So far we have proceeded without the use of this assumption. 

When the angle is 0 the approximating curves near the center of the circle
are formed by orbital arcs with direction almost parallel, one set of arcs having
the oppositely directed set entirely on its right (see Fig. 5). Otherwise by
the italicized remark above concerning the curvatures of two oppositely
directed orbits at a point it would follow that the two approximating arcs
intersect near to the center. Moreover we can choose two such arcs between
which there are no others of the same type. The region between two such
arcs evidently lies outside of the curve T, for this region lies to the right of T.
Now suppose a short orbital arc AB drawn across this region so as to join
A to B in a positive sense (see figure).

There are certainly one or more orbital arcs of T between A and B. In 
fact the arcs on which A and B lie do not meet near the center of the circle. 
Hence the number of these arcs is not increased if T be replaced by the curve 
T* obtained by the substitution of the orbital arc AB for the arc AB of T. 
Neither will the length of any arc of T* exceed d. 

Since all the vertices of T* lie in C, the orbital arc AB of T* lies in C (see 
end of § 10). Clearly then T* is of the type T. 

Now the value of J along T* is less than along T by a definite positive 
constant. In truth, the component A entering into J* = S — A is increased 
by the inclusion of the area bounded by the two arcs AB within the modified 
curve. Also the component S has been decreased by the deletion of an arc 
length which certainly exceeds the radius of the circle if the approximating
curves come near enough to the center, and has been increased by the substitution
of an arbitrarily short orbital arc AB. Consequently J has been
diminished by a quantity exceeding a quarter radius of the circle multiplied
by √(2γ_0) in all of the approximating curves from and after a fixed one.

But this conclusion is incompatible with our assumption that J is approaching
its lower bound J̄.

Consequently the angle at the center is not 0.

This completes our proof that every vertex of T_1 is the limit of at least one
vertex of T, so that T_1 has at most n constituent orbital arcs of length not
greater than d. Since T_1 is wholly accessible from the outer boundary of
the ring, it is seen that T_1 is a curve of type T.

Now if T_1 forms only part of T̄, the value of S for T_1 will be less than for T̄.
Also the value of A will be at least as large for T_1 since T_1 includes within it
all of the area within T̄. It is seen then that J is less for T_1 than for the
minimizing curve T̄, which is impossible.

It has now been proved that T̄ is a curve of type T.

(c) Proof that for n large enough T̄ is a periodic orbit. The vertices P̄_1, P̄_2,
..., P̄_n of T̄ need not be true vertices of that curve, inasmuch as the angle
at one or more of these vertices may equal π. Let Q_1, Q_2, ..., Q_k denote
the true vertices, if any such exist.

Among the k arcs forming T̄ a certain number k_1 will be of length less than
d while the others will be as great as d in length. If k_2 be the number of
the latter arcs we have k_1 + k_2 = k.

Consider an arc, Q_1 Q_2 say, of the first type. Since T̄ is of type T there
will be no parts of the curve adjoining this arc on its outer (right-hand) side.

The two exterior angles at the ends of Q_1 Q_2 must exceed π. In fact if
both angles are less than π but neither of them is zero, an orbital arc Q_1 Q'_2
drawn from one vertex Q_1 to a point Q'_2 near Q_2 on the arc of T̄ following
after Q_1 Q_2 will lie wholly outside of T̄. The side Q_1 Q'_2 will be less than d
if Q'_2 be taken near enough to Q_2. If then we consider the curve Q_1 Q'_2 ... Q_k
it is apparent that it forms a curve T lying in C (see end of § 10). But the
value of J taken along Q_1 Q'_2 is less than along Q_1 Q_2 Q'_2 since the orbital arc
from Q_1 to Q'_2 yields a minimum value of J as compared to any nearby arc
joining the same two points. Consequently J would be less along the new
curve than along T̄.

Precisely the same argument is available to rule out the possibility that
one exterior angle (say that at Q_2) is less than π but not equal to zero, while
the other exceeds π.

Moreover, if one of the angles, say that at Q_2, is equal to zero while the
other angle is different from zero, the arc which follows upon Q_1 Q_2 will lie
to its right and be tangent to it in a negative sense; here we recall the property
of oppositely tangent orbits proved earlier. Hence in this case too an arc
Q_1 Q'_2 will lie outside of T̄ and thus our argument may be used in this case also.

Finally the case of two zero angles may be excluded; in such an event
both arcs adjoining upon Q_1 Q_2 would lie to the right of it and have the opposite
direction to Q_1 Q_2. An orbital arc Q'_1 Q'_2 joining points Q'_1 and Q'_2 on the
arcs immediately preceding and following Q_1 Q_2 may be used in place of an
arc Q_1 Q_2. If Q'_1 and Q'_2 are taken sufficiently near to Q_1 and Q_2 respectively
the arc Q'_1 Q'_2 will lie outside of T̄; for otherwise Q_1 Q_2 and Q'_1 Q'_2 would cross
twice at nearby points which is not possible.

Consequently, if there are any arcs Q_1 Q_2 of length less than d, the two
exterior angles at the end-points exceed π.

Now neither interior angle at an extremity of such an arc can be zero,
since oppositely tangent orbits have been seen to lie to the right of each other.
And, unless the curve T̄ touches this orbital arc on its inner side, an argument
like the above can be used to show that the interior angles must exceed π.
Hence we are driven to the conclusion that every orbital arc Q_1 Q_2 of T̄ of
length less than d is touched on its inner (left-hand) side by T̄.

But we cannot have a point of contact of T̄ with Q_1 Q_2 save at a vertex
Q_i (i = 1, 2). In fact at a point of contact not at a vertex the tangent orbit
to Q_1 Q_2 would lie to the left of that arc, which is the inner side, whereas it
can only lie to the right.

On account of the fact that the interior angles at Q_1 and Q_2 are less than π
the exterior angle must be less than π at such a vertex Q_i.

According to our earlier argument, each of the two sides abutting upon Q_i
will therefore necessarily be of length at least as great as d.

Since there are k_1 arcs of length less than d, there must be at least k_1/2
vertices Q_i of contact, one for each two arcs.

But each arc of T̄ of length as great as d furnishes at most two vertices Q_i,
so that there are at least k_1/4 arcs of length as great as d, whence k_2 ≥ k_1/4.

On the other hand from the inequality (21) we infer

\[ l \le \frac{|\bar{J}| + \mu_0}{\sqrt{2\gamma_0}}, \]

where l denotes the length of T̄. But the total length of the k_2 arcs of length
at least d is as great as k_2 d so that we have

\[ k_2 \le \frac{|\bar{J}| + \mu_0}{\sqrt{2\gamma_0}\, d}. \]

Bearing in mind the relation k = k_1 + k_2 and the inequality k_1 ≤ 4k_2 we get

\[ k \le \frac{5\,(|\bar{J}| + \mu_0)}{\sqrt{2\gamma_0}\, d}. \]



In other words the number of true vertices Q_i on T̄ does not exceed a specifiable
integer, n_1, no matter how large n > n_0 is taken. Of course J̄ does not
increase with n so that one and the same value of J̄ can be used in this inequality
for any n > n_0.

Now let us choose an integer n_2 such that

\[ n_2 > \frac{|\bar{J}| + \mu_0}{\sqrt{2\gamma_0}\, d} \ge \frac{l}{d}. \]

In this case n_2 points interspersed at points along T̄ in such wise as to divide
it into n_2 parts of equal length will yield arcs of length less than d.

Suppose now that we choose n > n_1 + n_2. I assert that then T̄ can have
no vertices Q_i.

In the first place there are at most n_1 actual vertices on T̄. Secondly, if
we insert n_2 points equally spaced along T̄ and regard them as vertices also,
the resultant set of arcs form a curve on which every arc is of length less than d,
and we have not yet employed all of the available n vertices which may be
assigned on T̄.
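
The counting in this step can be made concrete with a small numerical sketch. The function name vertex_budget and the sample values are hypothetical; the only facts used are the bound on the number of true vertices (at most five times the quantity (|J̄| + μ_0)/(√(2γ_0) d)) and the requirement that n_2 strictly exceed that same quantity.

```python
import math


def vertex_budget(j_bar, mu0, gamma0, d):
    """Return (n1, n2, n) following the counting argument in the text.

    j_bar  : lower bound of J over the class of curves (illustrative input)
    mu0    : constant bounding the area component A from above
    gamma0 : positive minimum of gamma over the ring
    d      : maximum allowed length of a single orbital arc
    """
    if gamma0 <= 0 or d <= 0:
        raise ValueError("gamma0 and d must be positive")
    # Common quantity (|J| + mu0) / (sqrt(2*gamma0) * d); it bounds l/d from above.
    bound = (abs(j_bar) + mu0) / (math.sqrt(2.0 * gamma0) * d)
    n1 = math.floor(5.0 * bound)  # the number k of true vertices cannot exceed this
    n2 = math.floor(bound) + 1    # smallest integer strictly greater than bound
    n = n1 + n2 + 1               # smallest n satisfying n > n1 + n2
    return n1, n2, n


n1, n2, n = vertex_budget(j_bar=1.0, mu0=1.0, gamma0=0.5, d=0.5)
```

In the argument itself n must also exceed n_0, which this sketch does not track; the point is only that n_1 and n_2 depend on J̄, μ_0, γ_0 and d, and not on n.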

No exterior angle of T̄ can be less than π, since otherwise we could insert a
short external orbital arc across such a vertex, and thus decrease J without
using more than n vertices.

Also no interior angle can be less than π. This possibility is at once excluded
in a similar way unless the curve T̄ touches itself (on the inner side)
at that vertex. But this would imply that one of the other exterior angles
at the vertex is less than π, and we are led to the case excluded above.

Hence there are no true vertices for n sufficiently large. The curve T̄ forms
a periodic orbit which makes a single positive circuit of the ring and is wholly 
accessible from the outer boundary of the ring. Since this orbit is not tangent 
to itself on the left, it will be without double points. 

(d) Proof that T̄ is of minimum type. Such an orbit may lie entirely within
the ring. Consider any nearby rectifiable curve. On such a curve choose a
series of points far apart in comparison with their distance from the orbit,
but at a short distance from each other. The curve of orbital arcs joining
these points in order will then form a curve of type T for a fixed large n along
which J is less than or at most equal to the value it has along the given curve.
But the value of J is not less along any curve T than along the curve T̄.
Hence we conclude that the periodic orbit is of minimum type.

If the periodic orbit touches the boundary of the ring it will coincide with it,
and the same proof is available that was given in the reversible case. Moreover
the orbit is clearly of unilateral minimum type, and the above argument
is available to prove this fact.

(e) Extension to the case of general concave boundaries. We have thus
proved the existence of a periodic orbit of the kind desired, when the two
boundaries of the ring are made up of a finite number of orbital arcs with the
interior angles less than π. But in § 10 it was shown that the most general
concave boundary could be approached by boundaries of this special type 
and lying in the immediate neighborhood of the given boundaries. By the 
above reasoning we infer the existence of a periodic orbit on the new ring 
formed by the modified boundaries, and thus infer the truth of the theorem 
in the general case. 

12. Application of the criterion of § 11. In the reversible case, at least 
when the characteristic surface is closed and of genus p > 0, we were able to 
infer the existence of periodic orbits. In other cases the knowledge of boundaries
of a particular type was required before the existence of periodic orbits
could be inferred by the minimum method. 

A direct application of the result of § 11 for the irreversible case requires 
the knowledge of two boundaries of a particular type. One cannot hope to 
wholly avoid the use of such auxiliary boundaries, unless periodic orbits which 
intersect themselves are considered, as was not the case in § 11. In truth, 
if λ be very large and positive throughout, the curvature formula (20) shows
that the curvature is uniformly large and positive. Hence any orbit on the 
characteristic surface will necessarily either intersect itself, forming small 
loops, or it will tend asymptotically toward a small orbit of loop form. Thus 
no periodic orbits without double points exist on the surface except those 
deformable to a point. 

On the other hand, if α and β are so small that J exceeds some positive constant
multiple of the arc length along every orbital arc, the methods of §§ 8, 9
are available to prove essentially the same theorems for closed characteristic 
surfaces in the irreversible case as have been obtained in the reversible case. 
Here then is a case in which the existence of auxiliary boundaries is not required. 

In the present paragraph I shall show that the existence of one auxiliary 
boundary of a particular type suffices in many cases. 

If, in a given irreversible problem, λ and γ are positive throughout a closed
characteristic surface C of genus p > 0, on which is taken a single boundary 
concave toward the region on its left and not deformable to a point on that region, 
there will exist a periodic orbit of minimum type without double points into which 
this sensed boundary is deformable on its left. 

To begin with we will assume that the concave boundary is made up of 
orbital arcs with interior angles less than π.

Let us suppose that the genus p exceeds unity. We may regard the characteristic
surface as infinitely sheeted, and the given boundary as making a
closed cut in this surface. This implies merely that a circuit like that along 
the boundary is by convention regarded as one which takes a point back to 
its initial position. 




Consider now the reversible problem for which λ is zero but γ has the same
value as in the given irreversible problem. The given boundary is concave
in this reversible problem also. For, the curvature formula (20) shows that
the curvature of an orbital arc in the irreversible problem exceeds that of the
tangent orbital arc in the reversible problem by λ/√(2γ). Hence the reversible
arcs which touch the boundary will lie on its right, and the reversible arcs
which join nearby points of the boundary will lie within the region and on
its left. This suffices to make clear that the boundary is concave for the 
reversible problem also. 

If we restrict attention to a large enough part of this continuum it will 
evidently be impossible to deform the boundary off of it, even in part, without
making J (in the reversible problem) very large.

Consequently the result of § 9 shows that there exists a periodic orbit of 
minimum type for the reversible problem, lying within this continuum, into 
which the boundary may be continuously deformed on its left. 

The given boundary and part or all of this periodic orbit evidently form the
two boundaries of a ring on the continuum. The orbit, however, yields a 
concave boundary in the irreversible problem when taken in the same sense 
as the given boundary. Indeed the tangent orbits in the irreversible problem 
have greater curvature and thus are externally tangent to the ring. Thus 
orbital arcs (in the irreversible problem) connecting nearby points in the 
positive sense lie wholly within the ring, and the boundary is concave. 

Applying the result of § 11 to this ring we obtain the stated conclusion for 
p > 1, at least if the given boundary is composed of orbital arcs. But it 
was proved in § 10 that the given boundary can always be enclosed by a concave
boundary of this special sort which lies in its immediate neighborhood.
Hence the orbit will exist for p > 1 in all cases. 

When p = 1, and the given boundary is deformable to a point on its right-hand
side, the cut continuum obtained by the same process as before may be
mapped on an infinite cylinder on which there is a single boundary which can
be deformed to a point on its right-hand side. The same argument is applicable
as before.

When p = 1 and the boundary is not deformable to a point on its right-hand
side, the cut continuum is analogous to the part of an infinite cylinder
bounded by one base which corresponds to the given boundary. In this
case the boundary can be deformed to the infinitely remote parts of the continuum
without J (in the reversible problem) increasing indefinitely. A
modification of the preceding argument is therefore required. 

It has been proved that there exists a periodic orbit in the associated reversible
problem which is derivable from the given boundary by deformation.
On the cylinder this appears as any one of a set of equally spaced curves
making a single circuit of the cylinder. The ring suited to replace the ring
used in the other cases is that bounded by the given base and one of this 
congruent set which is taken so remote that it will not intersect the base. 
The rest of the discussion may be made as in the other cases. 

13. An example. The condition that λ is of one sign, or some restriction
of similar import, is essential to the success of the minimum method in the
irreversible case. I shall present an example to prove that the minimum
method may fail if λ is not of one sign. More precisely, it will be established
by an example that if this restriction upon λ be removed, the minimizing
curve T̄ which minimizes J among all curves T (see § 11) may not be a periodic
orbit for any choice of n.

We will consider a ring in the xy-plane whose two boundaries are concentric
circles with center at the origin of coordinates. By a conformal transformation
of the type treated in § 2 we may take this ring into the square with
opposite vertices at (0, 0), (1, 1) in a new x̄ȳ-plane (Fig. 6), so that ȳ = 0
and ȳ = 1 correspond to the same radial line y = 0 in the xy-plane. Such a
transformation leaves (1'), (4') unaltered in form, while λ̄ and γ̄ are periodic
functions of ȳ of period 1. Conversely, from a dynamical problem in the
x̄ȳ-plane where λ̄ and γ̄ are periodic in ȳ of period 1, we may pass back to a
problem over a ring in the xy-plane.

The integrals J, S, A, and the corresponding integrals J̄, S̄, Ā are of course
equal along corresponding curves.

For convenience we will first construct an example in the x̄ȳ-plane, and
later interpret it in the xy-plane.

We will take 2γ̄ equal to unity in the x̄ȳ-plane. By (20) the curvature κ̄
then becomes λ̄. The orbits will accordingly have curvature at each point
equal to a given function of position λ̄.
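
When the speed is unity and the curvature is a prescribed function of position, an orbit can be traced by integrating its direction angle along the arc length. The following is a minimal sketch under stated assumptions: the system dx/ds = cos φ, dy/ds = sin φ, dφ/ds = λ(x, y) is the usual description of a unit-speed curve with curvature λ, the name trace_orbit and the classical fourth-order Runge-Kutta step are choices made here, and the constant test value of λ is purely illustrative (the example in the text uses a variable λ).

```python
import math


def trace_orbit(lam, x0, y0, phi0, length, steps=2000):
    """Follow a unit-speed curve whose curvature at (x, y) is lam(x, y).

    Integrates dx/ds = cos(phi), dy/ds = sin(phi), dphi/ds = lam(x, y)
    over arc length `length` with classical RK4; returns (x, y, phi).
    """
    def rhs(state):
        x, y, phi = state
        return (math.cos(phi), math.sin(phi), lam(x, y))

    h = length / steps
    state = (x0, y0, phi0)
    for _ in range(steps):
        k1 = rhs(state)
        k2 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = rhs(tuple(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = rhs(tuple(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (a + 2.0 * b + 2.0 * c + e) / 6.0
            for s, a, b, c, e in zip(state, k1, k2, k3, k4)
        )
    return state


# Constant positive curvature 4: the orbit is a circle of radius 1/4 described
# to the left, and it should close up after arc length 2*pi/4.
x, y, phi = trace_orbit(lambda x, y: 4.0, 0.0, 0.0, 0.0, 2.0 * math.pi / 4.0)
```

With a large constant curvature the orbit is a small closed loop, which matches the qualitative remark in § 12 that large λ forces orbits into small loops; a position-dependent λ can be passed in the same way.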

The function λ̄ will be chosen positive along the line x̄ = 0, and negative
along the line x̄ = 1. Under this hypothesis tangent orbits along x̄ = 0
and x̄ = 1 with direction of motion that of the positive ȳ-axis will lie outside of the
strip 0 ≤ x̄ ≤ 1 near the point of tangency. In the xy-plane the corresponding
circular boundaries are therefore concave in a positive sense.

We will restrict λ̄ still further. We will take λ̄ to be zero along a line
x̄ = a < 1 near x̄ = 1, and negative for x̄ > a. For x̄ < a we will take λ̄
to be positive outside of a circle lying within the unit square, and large save
near the circle and x̄ = a; within the circle λ̄ is to be negative, and large
save near its circumference.

It is clearly possible to meet all of these requirements and still to have λ̄
analytic in x̄, ȳ and periodic in ȳ of period 1.

In the xy-plane the function λ will change sign along the image of the
circle and of the line x̄ = a.

If the earlier hypothesis λ > 0 was superfluous, we ought still to be able to
affirm the existence of a periodic orbit of type T along which J was an absolute
minimum, at least among all curves T which make a single positive circuit of
the ring. Let us suppose that such an orbit does exist.

The corresponding orbit in the x̄ȳ-plane would be given by a curve joining
a point of ȳ = 0 to the congruent point of ȳ = 1, and lying wholly in the
strip 0 ≤ x̄ ≤ 1 but not necessarily lying wholly in the unit square.

Inasmuch as the curvature is large and of one sign save near the circumference
of the circle and near x̄ = a, it follows that an orbit not lying wholly
near this circumference or line will have points of intersection with itself
nearby if produced in both directions, or will wind around in a spiraliform
orbit. In this way it appears that the periodic orbit assumed to exist must
lie wholly near to x̄ = a.

Since the orbit does not cross itself the motion at a right-most point (x̄
a maximum) must be in the direction of the positive ȳ-axis, and the curvature
must be positive or zero at the point. Hence this point cannot lie to the
right of x̄ = a where λ̄ is negative. Similarly a left-most point cannot lie
to the left of x̄ = a. Consequently the orbit in question necessarily coincides
with the line x̄ = a.

It is evident that the line x̄ = a does yield a periodic orbit of type T.

Furthermore, any modification of this orbit to a curve in its near vicinity
which joins a point of $\bar{y} = 0$ to the opposite point of $\bar{y} = 1$ cannot
diminish the arc length S and will increase the area integral A, at least if
the curve has no double points.* Hence the orbit $\bar{x} = a$ is of minimum type.

However, we see readily that J does not have as small a value along
$\bar{x} = a$ as along other curves joining opposite points of $\bar{y} = 0$ and
$\bar{y} = 1$ on the strip $0 \le \bar{x} \le 1$ and without double points. We may
take as such a curve one

* It was shown in § 11 that any nearby curve could be replaced by one without
double points and with a lesser value of J.




G. D. BIRKHOFF 


[April 


that consists of all of $\bar{x} = a$ except a short segment to the right of the
center of the circle, and of a negative loop about this circle, replacing the
deleted segment of $\bar{x} = a$, with sides near together save near the circle
(see line $P_1 P_2 P_3 P_4 P_5$ of figure). By this modification S is increased
by less than the length of the loop. The integral $A = \iint \lambda \, dx \, dy$
has now been increased by a large quantity, since the circular area has been
excluded from the region of integration, and $\lambda$ is large and negative
throughout most of this circle. Hence J has been considerably diminished. If
the loop be made up of short orbital arcs the corresponding curve in the
xy-plane will be a curve T along which J is smaller than along the periodic
orbit corresponding to $\bar{x} = a$ which has been seen to be the only periodic
orbit of type T.

That is, there is no periodic orbit of type T which furnishes an absolute 
minimum for J. 

14. Scope of the minimum method. To what type of orbits is the minimum
method applicable? In order to answer this question we have recourse
to the differential equation (16) of normal displacement

$$\delta n'' + I\,\delta n = 0$$

obtained in § 5. Here the orbit from which the "infinitesimal" normal
displacement $\epsilon\,\delta n$ is measured is the given periodic orbit, and $I$ is
a periodic function of the time $t$ having the period $\tau$ of the orbit.

Consider any solution $\delta n$ of this linear differential equation which
vanishes at $t = t_0$. The most general solution with this property is a
constant multiple of any particular one, and we assume that the solution under
consideration is not identically zero.

If $\delta n$ vanishes at a later time $t = t_1$ then every solution will vanish in
the interval $(t_0, t_1)$ and likewise in the intervals
$(t_0 + k\tau, t_1 + k\tau)$ where $k$ is an arbitrary integer. Thus every solution
will vanish infinitely often before and after $t = t_0$.

When this situation arises, and $t_1$ is taken to be the first zero of $\delta n$
after $t = t_0$, there is defined a one-to-one sense-preserving analytic
transformation from $t_0$ to $t_1$, or from the point $Q_0$ of the orbit which
corresponds to $t_0$ to the point $Q_1$ which corresponds to $t_1$. According to
Poincaré,* with such a transformation of a closed curve into itself there is
associated a unique real number $\sigma$ defined by the following property: the
$m$th transform $t_m$ of $t_0$ lies between the integral part of
$m\sigma\tau/2\pi$ and the next greater integer. We shall call this number
$\sigma$ the rotation number of the orbit; it measures the mean angular advance
between successive points of crossing of the periodic orbit by an orbit in its
infinitesimal vicinity.

* Journal de mathématiques, ser. 4, vol. 1 (1885), pp. 167-244; in particular
pp. 220-244.





1917]




DYNAMICAL SYSTEMS WITH TWO DEGREES OF FREEDOM 

It may happen that the solution $\delta n$ which vanishes at $t = t_0$ does not
vanish again. In this case no solution can vanish more than once. For obvious
reasons we will write $\sigma = \infty$ in this case.
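The two cases just distinguished can be checked numerically. The following is a minimal sketch and not part of the original argument: it integrates the displacement equation with a classical Runge-Kutta step and counts the later sign changes of the solution that vanishes at $t_0$. The names `count_sign_changes` and `sample_negative`, the step count, and the sample coefficients $I$ are illustrative assumptions.

```python
import math


def count_sign_changes(I, t0, t_end, steps=20000):
    """Integrate dn'' + I(t) dn = 0 with dn(t0) = 0, dn'(t0) = 1 by RK4
    and count how often dn changes sign on (t0, t_end]."""
    h = (t_end - t0) / steps
    t, u, v = t0, 0.0, 1.0  # u = dn, v = dn'
    changes = 0
    for _ in range(steps):
        # First-order system: u' = v, v' = -I(t) u.
        k1u, k1v = v, -I(t) * u
        k2u, k2v = v + 0.5 * h * k1v, -I(t + 0.5 * h) * (u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -I(t + 0.5 * h) * (u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -I(t + h) * (u + h * k3u)
        u_new = u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6
        v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        if (u > 0 and u_new < 0) or (u < 0 and u_new > 0):
            changes += 1
        t, u, v = t + h, u_new, v_new
    return changes


def sample_negative(t):
    """A hypothetical periodic coefficient (period 1) that is negative throughout."""
    return -1.0 + 0.5 * math.cos(2 * math.pi * t)
```

For the constant coefficient $I = 1$ the solution is $\sin t$, whose zeros recur at spacing $\pi$, so that $\sigma$ is finite. For a coefficient that is nowhere positive, such as `sample_negative`, the solution never returns to zero, which is the case $\sigma = \infty$.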

It is only the orbits $\sigma = \infty$ which can be yielded by the minimum method.
For then and only then will J have a minimum along the given periodic orbit.
The proof of this fact is direct* and depends on classical methods in the
calculus of variations.

In order, however, that the criteria given above be applicable it is further 
required that the periodic orbit can be surrounded by two curves (to take a 
simple case) which are concave toward the ring which they delimit. We shall 
show that this further requirement can be met. 

By a multiple periodic orbit is meant one for which the differential equation
of normal displacement has at least one periodic solution (not identically
zero) with the period $\tau$ of the given orbit. Otherwise the orbit is called
simple.

The minimum method is applicable to all simple periodic orbits for which
$\sigma = \infty$.

To establish this fact we think of a pair of linearly independent solutions
$\delta n_1$, $\delta n_2$ of the displacement equation as the homogeneous
coordinates of a point P on the projective line. In virtue of the familiar
relation

$$\delta n_1\,\delta n_2' - \delta n_2\,\delta n_1' = c \neq 0, \tag{22}$$

which obtains between $\delta n_1$, $\delta n_2$, the point P describes the
projective line continually in one sense.

On account of the hypothesis $\sigma = \infty$ the point P cannot describe the
complete projective line, for if we had $\delta n_1/\delta n_2 = c_1/c_2$ for two
distinct values of $t$, the solution $c_2\,\delta n_1 - c_1\,\delta n_2$ would
vanish twice.

Hence as $t$ increases from $t_0$ to $t_0 + \tau$ a segment $A_0 A_1$ of the
projective line is described.

Since $I$ is periodic in $t$ of period $\tau$ the solutions $\delta n_1$,
$\delta n_2$ will necessarily be replaced by certain linear combinations of
themselves after $t$ has increased by $\tau$. More explicitly, we may write

$$\delta n_1(t + \tau) = a\,\delta n_1(t) + b\,\delta n_2(t), \qquad
\delta n_2(t + \tau) = c\,\delta n_1(t) + d\,\delta n_2(t).$$

The equation

$$ad - bc = 1 \tag{23}$$

holds because of the relation between $\delta n_1$ and $\delta n_2$ noted above.
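Relation (23) and the coefficients $a$, $b$, $c$, $d$ can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the author's procedure: it normalizes the two solutions by $\delta n_1(0) = 1$, $\delta n_1'(0) = 0$, $\delta n_2(0) = 0$, $\delta n_2'(0) = 1$ (so that the constant in (22) is 1), integrates over one period with a Runge-Kutta step, and counts the real invariant points of the resulting projective transformation from the sign of $(a + d)^2 - 4$. All function names are hypothetical.

```python
def solve_displacement(I, u0, v0, tau, steps=2000):
    """RK4 integration of dn'' + I(t) dn = 0 on [0, tau]; returns (dn, dn') at tau."""
    h = tau / steps
    t, u, v = 0.0, u0, v0
    for _ in range(steps):
        k1u, k1v = v, -I(t) * u
        k2u, k2v = v + 0.5 * h * k1v, -I(t + 0.5 * h) * (u + 0.5 * h * k1u)
        k3u, k3v = v + 0.5 * h * k2v, -I(t + 0.5 * h) * (u + 0.5 * h * k2u)
        k4u, k4v = v + h * k3v, -I(t + h) * (u + h * k3u)
        u, v = (u + h * (k1u + 2 * k2u + 2 * k3u + k4u) / 6,
                v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6)
        t += h
    return u, v


def period_matrix(I, tau):
    """Coefficients (a, b, c, d) of the substitution after one period tau."""
    a, b = solve_displacement(I, 1.0, 0.0, tau)  # dn1(tau), dn1'(tau)
    c, d = solve_displacement(I, 0.0, 1.0, tau)  # dn2(tau), dn2'(tau)
    return a, b, c, d


def real_invariant_points(a, b, c, d, tol=1e-9):
    """Number of real fixed points of (n1 : n2) -> (a n1 + b n2 : c n1 + d n2).
    The identity transformation, which fixes every point, is not treated here."""
    disc = (a + d) ** 2 - 4.0 * (a * d - b * c)
    if disc > tol:
        return 2
    if disc < -tol:
        return 0
    return 1
```

With the constant coefficient $I = -1$ and $\tau = 1$ the sketch gives $a = d = \cosh 1$ and $b = c = \sinh 1$, hence two real invariant points; with $I = +1$ it gives none, in agreement with the fact that every solution then vanishes infinitely often.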

Therefore as $t$ increases further from $t_0 + \tau$ to $t_0 + 2\tau$ we obtain a
second segment $A_1 A_2$ which may be derived from $A_0 A_1$ by the projective
transformation

$$\bar{n}_1 = a n_1 + b n_2, \qquad \bar{n}_2 = c n_1 + d n_2.$$


* See Poincaré, Les méthodes nouvelles de la mécanique céleste, vol. 3 (Paris,
1899), pp. 283-293.

Trans. Am. Math. Soc. 16




In this way a series of segments $A_1 A_2$, $A_2 A_3$, $\ldots$ is obtained,
derived from $A_0 A_1$ by means of this particular transformation and its
successive iterations. Likewise by decreasing $t$ a series of segments
$A_{-1} A_0$, $A_{-2} A_{-1}$, $\ldots$ is successively obtained, which may be
derived from $A_0 A_1$ by use of the inverse transformation and its iterations.

The totality of segments $A_i A_{i+1}$ so obtained will not cover the complete
projective line, as has been noted earlier.

This transformation must therefore have either one or two real invariant
points. If there were no such points the transformation would necessarily
generate the complete line. Hence by a proper choice of $\delta n_1$,
$\delta n_2$ (i.e., by a proper choice of points $0$, $\infty$ on the line) the
above formulas will take one of the two forms

$$\delta n_1(t + \tau) = p\,\delta n_1(t), \qquad
\delta n_2(t + \tau) = \frac{1}{p}\,\delta n_2(t),$$

$$\delta n_1(t + \tau) = \delta n_1(t), \qquad
\delta n_2(t + \tau) = \delta n_1(t) + \delta n_2(t).$$

It should not be forgotten that the transformation is direct and that (23)
holds. Moreover in the first form we may exclude the possibility $p = 1$
since the transformation is not the identity, and we may exclude the
possibility that $p$ is negative for in that case $\delta n_1$ would change sign
infinitely often.

In the second case the differential equation of displacement possesses a
periodic solution $\delta n_1$, and the given periodic orbit of minimum type is to
be regarded as a multiple periodic orbit.

We are now prepared to establish that the minimum method is applicable
in all cases $\sigma = \infty$, at least when the periodic orbit is simple.

In the first of the two cases above we are at liberty to assume $p > 1$.
Because of the multiplicative property of $\delta n_1$ expressed above, that
function cannot change sign once without doing so infinitely often.
Consequently $\delta n_1$ is of one sign for all values of $t$, and may be taken
positive after multiplication by a constant. The same property shows that
$\delta n_1$ will approach $+\infty$ for $\lim t = +\infty$ and will approach 0 for
$\lim t = -\infty$. Similarly, if $\delta n_2$ be taken positive, $\delta n_2$ will
approach 0 for $\lim t = +\infty$ and $+\infty$ for $\lim t = -\infty$.

The solution $\delta n = \delta n_1 + \delta n_2$ is everywhere positive and
approaches $+\infty$ for $\lim t = \pm\infty$. Suppose that $\delta n$ admits an
absolute (positive) minimum for $t = \bar{t}$. If we plot $r = \delta n$ as a curve
in polar coordinates and $2\pi t/\tau$ as the angular variable, that curve will
lie outside of a circle $r = c$ to which it will approach most nearly for
$t = \bar{t}$. Moreover it will recede indefinitely from that circle as
$t - \bar{t}$ increases indefinitely in absolute value. It is therefore
intuitively evident that there will exist a single loop of the curve with
interior angle less than $\pi$ at the vertex.




A corresponding slightly displaced orbit, of normal distance approximately
proportional to $\delta n$, will therefore form an orbital loop on one side of the
given periodic orbit with an interior angle toward that orbit of magnitude less
than $\pi$. A second corresponding slightly displaced orbit will form a second
such loop on the other side of the orbit. The two orbits taken together form
the two concave boundaries of the ring of which they are the boundaries. Hence
the minimum methods of §§ 9, 11 may be applied, at least unless the dynamical
problem is an irreversible one in which $\lambda$ changes sign along the periodic
orbit.

But even in this exceptional case the minimum method may be applied.
To this end we recall that $\alpha$ and $\beta$ in the integral J are merely fixed
in so far that the equation $\alpha_y - \beta_x = \lambda$ is to hold. If then the
orbit is taken conformally into the x-axis this relation may be satisfied by
putting

$$\alpha = \int_0^y \lambda \, dy, \qquad \beta = 0.$$

The values of $\alpha$ and $\beta$ so obtained are zero along the orbit. Going back
to the given variables we see that the functions $\alpha$ and $\beta$ may be chosen
so that they both vanish along the orbit, and are small near that orbit. The
minimum method has been observed to be applicable in such a case (see beginning
of § 12).

Since nearby orbits may recede indefinitely from a periodic orbit
$\sigma = \infty$ without crossing it, we shall call these orbits completely
unstable. The results
of the present paragraph show that other methods must be devised to discover 
other types of orbits, and we proceed now to give a method of this sort. 

15. The principles of minimum and of minimax. The algebraical minimum
principle upon which the criteria for the orbits of minimum type may be
based is the following:

Minimum Principle. If an analytic function J is defined throughout a
continuum (in n-dimensional space) and is less than $J'$ at some interior point
$P_0$, and if along the entire boundary either J exceeds $J'$ or the normal
derivative of J toward the interior region is negative, then there exists an
interior point $\bar{P}$ at which J has a relative minimum $\bar{J} < J'$, and such
that a point P may vary continuously from $P_0$ to $\bar{P}$ within the continuum
while J remains less than $J'$.

This principle is an immediate consequence of the observation that the
continuum containing $P_0$ defined by the inequality $J < J'$ necessarily
contains a point $\bar{P}$ at which J has an absolute minimum $\bar{J}$ throughout
this continuum. On account of the conditions imposed along the boundary (if
there is a boundary), the point $\bar{P}$ will lie within the original continuum
as well as within the continuum $J < J'$. Thus J has a relative minimum at
$\bar{P}$.

Another type of point at which all the directional derivatives of a function
J vanish is defined as follows: If $J_0$ is the value of J at a point $P_0$ and
if the inequality $J < J_0 - \epsilon$, where $\epsilon$ is small and positive,
defines more than one region near the point $P_0$, then $P_0$ will be called a
point of minimax. If the inequality defines $k$ regions in the neighborhood of
$P_0$, that point will be said to be of multiplicity $k - 1$. Clearly $P_0$ is
not a point of minimum. In the case $n = 1$, $P_0$ is a point of maximum.

The algebraical principle upon which the consideration of orbits of minimax 
type will be based is the following. 

Minimax Principle. Let a function J be analytic within and continuous
throughout a continuum (in n-dimensional space) possessing m-fold linear
connectivity, and let there exist $l$ points of minimum $P_1, P_2, \ldots, P_l$
in the continuum. If, whenever a point P is varied from a point $P_i$ to a
point $P_j$ in the continuum with $J \le J'$, it is possible to continuously
modify the path of P into another path from $P_i$ to $P_j$ within the continuum
along which we have $J \le J'$, then there exist at least $m + l - 1$ points of
minimax within the continuum.

This second principle does not seem to have been as explicitly employed as 
the companion minimum principle given above. It may also be established 
easily. 

Let us begin with the case $m = 0$ so that the given continuum is simply
connected. Consider the regions of the given continuum defined by the
inequality $J \le p$. When $p$ is less than the absolute minimum of J
throughout the continuum there are no such regions. As $p$ increases through
the minimum of J we get a single region which includes within it the
corresponding minimum point P which we will take to be $P_1$. More generally
let us assume that the minimum points $P_i$ have been so arranged that
$J_1 \le J_2 \le \cdots \le J_l$. At present we pass over the special case when
some of the quantities $J_i$ are equal.

As $p$ increases still further this region expands and may reach the boundary.
For $p = J_2$ a second region comes into existence about the point $P_2$. This
region will expand also with further increase of $p$.

Thus as $p$ increases we have $l$ regions coming into existence about the points
$P_1, P_2, \ldots, P_l$ in order.

Meanwhile various ones of these regions may have united. Such a junction
cannot take place along the boundary of the continuum on account of our
hypothesis. For, if a junction were to occur at a point of the boundary,
say for $J = p'$, it would be possible to join the corresponding points $P_i$
and $P_j$ contained within these regions by a line lying in the continuum along
which $J \le p'$; in particular we should have necessarily $J = p'$ at some
point of the boundary along this line. But by our hypothesis such a line may
be deformed into a second line, joining the same two points and lying within
the continuum, along which we have $J \le p'$. This state of affairs indicates,
however, that the two regions under consideration either have united for
$p < p'$ contrary to our assumption, or have united at an interior point for
$p = p'$. Consequently the regions will unite first at points within the
continuum.




Now, when $p$ has increased sufficiently, all of the regions will have united
into a single region comprising all of the given continuum. Consequently
there are at least $l - 1$ interior points of junction required unless more
than two regions unite at a single point. A point of junction is of course a
point of minimax. Counting multiplicities properly, we have always at least
$l - 1$ points of minimax.
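The sweep over increasing values of $p$ described above can be imitated on a grid. The following is a minimal sketch under assumptions of our own choosing, not a construction from the paper: the continuum is a square grid with 4-neighbor connectivity, the regions $J \le p$ are tracked with a union-find structure, a "birth" is recorded whenever a new region appears, and a "junction" whenever two or more existing regions unite at a cell. The function name and the sample function $J = (x^2 - 1)^2 + y^2$ are hypothetical.

```python
def sublevel_events(J, half_width, h):
    """Sweep the level p upward over the grid points (i*h, j*h), |i|, |j| <= half_width.
    Returns (births, junctions): births are (value, x, y) where a new region J <= p
    appears; junctions are (value, x, y, multiplicity) where regions unite."""
    n = half_width
    cells = sorted((J(i * h, j * h), i, j)
                   for i in range(-n, n + 1) for j in range(-n, n + 1))
    parent = {}

    def find(c):
        # Union-find root lookup with path halving.
        while parent[c] != c:
            parent[c] = parent[parent[c]]
            c = parent[c]
        return c

    births, junctions = [], []
    for value, i, j in cells:
        neighbors = ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1))
        roots = {find(nb) for nb in neighbors if nb in parent}
        parent[(i, j)] = (i, j)
        if not roots:
            births.append((value, i * h, j * h))
        elif len(roots) > 1:
            # k regions meeting at one cell count as a junction of multiplicity k - 1.
            junctions.append((value, i * h, j * h, len(roots) - 1))
        for r in roots:
            parent[r] = (i, j)
    return births, junctions
```

For the sample function on the square $|x|, |y| \le 2$ with step $h = 1/8$, the sweep records two births, at the minima $(\pm 1, 0)$, and a single junction at the origin for $p = 1$, so that $l - 1 = 1$ point of minimax is found. Since $J$ increases toward the boundary of this square, the junction is interior, as the hypothesis of the principle requires.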

Furthermore, the possibility that some of the quantities $J_i$ are equal merely
means that more than one region comes into existence for the same value of $p$,
which in no way affects our argument.

If we had assumed that the linear connectivity of the given continuum
was not zero, an entirely analogous argument would have led us to the
conclusion that there existed $m + l - 1$ points of junction. For, as each
junction takes place, either the number of regions $J \le p$ is diminished by
unity or the total linear connectivity of these regions is increased by unity,
but not both. Also we can infer that such a junction takes place within the
continuum: otherwise there would be a type of line joining a point $P_i$ to a
point $P_j$ along which $J \le p$, while no line deformable into it exists lying
wholly within the continuum. This is contrary to hypothesis. Thus there are
at least $m + l - 1$ points of junction in the general case.

It is interesting to determine the characteristic property which distinguishes
points of minimax from other points at which all of the directional derivatives
vanish. At any point where these derivatives vanish, the function J may
generally be expanded in the form

$$J - p = \pm x_1^2 \pm x_2^2 \pm \cdots \pm x_n^2 + \cdots,$$

where $x_1, x_2, \ldots, x_n$ are properly chosen variables. If all the
coefficients are positive we have a minimum point; if all are negative, a
maximum point.

If all of the terms in the expansion beyond those of the second order be
omitted, there is obtained a set of quadric surfaces $J = p'$ which approximate
to the given surfaces in form near the point $x_1 = x_2 = \cdots = x_n = 0$
under consideration. We shall treat this form of J only, but it is readily
seen that the argument is essentially applicable to J in its original form.

Let us suppose that the first $k$ of the coefficients are negative and the
others positive. Then the points $J = p' < p$ may be interpreted as follows:
Consider two spheres of radii $r_1$ and $r_2$ with their centers at the origin
in an $x_1, x_2, \ldots, x_k$ space and in an $x_{k+1}, x_{k+2}, \ldots, x_n$
space respectively. Let us impose the condition

$$r_1^2 - r_2^2 = p - p' \qquad (p' < p).$$

A pair of points, one on each sphere, evidently corresponds to a point on the
approximating manifold $J = p'$.

Any possible pair of values of $r_1$ and $r_2$ may be obtained from any other by
continuous variation. Also any point of a sphere can be continuously varied
into any other without modifying the radius, at least unless we have a
one-dimensional sphere, consisting of two distinct points only. Hence, unless
we have $k = 1$ or $k = n - 1$, the region $J < p'$ is made up of one piece near
the point.

Moreover we observe that $r_2$ can be continuously varied from positive to
negative values if $r_1$ and $r_2$ are connected by the above equation, while
$r_1$ necessarily remains of one sign. We see then that this manifold $J < p'$
consists of a single piece for $k = n - 1$.

Therefore the case $k = 1$ is the only case which can yield a minimax. But
in this case $r_1$ remains either positive or negative no matter how $r_2$
varies. Thus we have actually two regions $J < p'$, and a corresponding point
of minimax.
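As a small worked illustration, which is ours and not taken from the paper, the quadratic model may be written out in the two lowest-dimensional cases. Here $c = p - p' > 0$ is shorthand introduced only for this example.

```latex
% Case n = 2, k = 1:  J - p = -x_1^2 + x_2^2.
\[
  J < p' \iff x_1^2 > x_2^2 + c
         \iff x_1 > \sqrt{x_2^2 + c} \quad\text{or}\quad x_1 < -\sqrt{x_2^2 + c}.
\]
% The strip |x_1| <= sqrt(c) separates the two pieces: a point of minimax.

% Case n = 3, k = 2 = n - 1:  J - p = -x_1^2 - x_2^2 + x_3^2.
\[
  J < p' \iff x_1^2 + x_2^2 > x_3^2 + c .
\]
% This is the exterior of a hyperboloid of one sheet, which is a single piece.
```

In the first case the two branches of the hyperbola $x_1^2 - x_2^2 = c$ bound two separate regions, while in the second the region surrounds the $x_3$-axis and is connected, in agreement with the statement that only $k = 1$ yields a minimax.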

Our hypothesis concerning the boundary of the given continuum may perhaps
appear somewhat artificial. A slight consideration shows, however,
that the hypothesis will be satisfied if the boundary possesses a continuously 
turning tangent plane and if the inner normal derivative of J is negative at 
every point of the boundary. In this event a line in the continuum joining 
two minimum points may be deformed so that each point moves along the 
stream line of the function J. This deformation generates another line 
lying wholly within the continuum, along which J is less at corresponding 
points than before. Thus the hypothesis will hold. We have chosen that 
particular form of statement which makes possible an immediate application 
of the minimax principle. 

16. The minimax method for p > 0. Reversible case. The types of periodic
orbits which we are about to consider have the following characteristic
property: the value J ' of J is not a minimum along the orbit, and nearby 
curves for which J < J' fall into two distinct classes, no member of one of 
which can be continuously deformed into a member of the other under the 
restriction J < J'. These orbits will be termed of minimax type. 

We shall first take the case when the characteristic surface is of genus
p > 0 which is somewhat simpler:

If the characteristic surface is closed and of genus p > 0 in a reversible
problem with $\gamma > 0$, and if there exist $l > 0$ periodic orbits of minimum
type deformable to a point, then there exist at least $l$ periodic orbits of
minimax type deformable to a point.

If there exist $l \ge 1$ periodic orbits of minimum type deformable into one
another but not to a point on the characteristic surface, then there exist at
least $l$ or $l - 1$ orbits of minimax type into which they may be deformed,
according as p = 1 or p > 1.

Let us commence with the case when there exist $l$ orbits deformable to a
point. It is convenient to employ the geodesic interpretation of J as the arc
length on the characteristic surface.

Clearly it is possible to choose a value $J'$ of the arc length J so large that
a curve of length less than this value $J'$ may be continuously deformed from
any one of the minimum periodic orbits into any other of the set or into a
point. As such a curve varies from any one such orbit to any other or to a
point, a set of $n$ orbital arcs $P_1 P_2, P_2 P_3, \ldots, P_n P_1$ joining $n$
points of that curve taken at equal arc intervals will also form a curve which
also varies from one orbit to the other or to a point. The value of J along
the modified curve will not be larger than along the original curve, and each
arc of the new curve will be of arc length less than $d = J'/n$. Here it is
supposed that n is taken large.
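The effect of replacing a curve by $n$ arcs joining equally spaced points can be seen in the simplest possible setting. The sketch below assumes, purely for illustration, a flat characteristic surface on which the orbital arcs (geodesics) are straight segments, and takes the closed curve to be a circle; the function name is hypothetical.

```python
import math


def inscribed_polygon(radius, n):
    """Vertices taken at equal arc intervals on a circle of the given radius,
    together with the lengths of the straight arcs joining successive vertices."""
    pts = [(radius * math.cos(2 * math.pi * k / n),
            radius * math.sin(2 * math.pi * k / n)) for k in range(n)]
    chords = [math.dist(pts[k], pts[(k + 1) % n]) for k in range(n)]
    return pts, chords
```

For a circle of radius 1 and $n = 12$ the modified curve has total length $2n \sin(\pi/n)$, which is less than $2\pi$, and each of its arcs is shorter than $2\pi/n$, hence shorter than $d = J'/n$ for any $J'$ exceeding the length of the original curve.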

Now $n$ points $P_1, P_2, \ldots, P_n$ of this sort ranging independently over
the given characteristic surface evidently determine a 2n-dimensional analytic
manifold. We will denote this manifold by $C_{2n}$.

A set of $n$ points $P_1, P_2, \ldots, P_n$ such that successive points are not
at a distance greater than $d$ from each other corresponds to a single point of
$C_{2n}$. The totality of such points evidently forms one or more continua
lying within $C_{2n}$. One of these continua, say $D_{2n}$, will contain $l$
points $K_1, K_2, \ldots, K_l$ corresponding to the $l$ orbits of minimum type.

Since there are various ways of choosing the vertices along a periodic orbit, 
such an orbit will yield more than a single point. In fact each vertex may be 
varied along the orbit, so that to an orbit corresponds an n-dimensional region. 
Part of this region will lie on the boundary, since the vertices may be varied 
into coincidence or so as to be at a distance d apart. On the other hand part 
of the region lies within $D_{2n}$ for, if we take the vertices at equal
distances from each other, their distance apart along the curve will be less
than $d$. We recall that the arc length is less than $J'$ along any such orbit.
Hence $K_1, K_2, \ldots, K_l$ may be taken within $D_{2n}$.

Each of these $l$ points gives a minimum value of the function J. In the
contrary case there would be nearby points of $D_{2n}$ for which J is smaller
than at the point, and this would correspond to a curve of orbital arcs on the
characteristic surface near an orbit of minimum type but yielding a smaller
value of J. This is not possible.

Likewise the point curve obtained by letting $P_1, P_2, \ldots, P_n$ coincide
also yields a minimum 0 for J and lies on the boundary of $D_{2n}$. Let us
denote the corresponding point of $D_{2n}$ by $K_{l+1}$.

In order to apply the minimax principle of the preceding paragraph it is
sufficient to be assured that if one can pass from one of the points $K_i$ to
another with $J \le J_1 < J'$ by means of a curve in $D_{2n}$, then it is
possible to pass from one of these regions to the other by means of a curve
within $D_{2n}$ along which $J \le J_1$.




This must necessarily be the case. Such a curve corresponds to a continuously
varying set of curves $P_1 P_2 \cdots P_n$ on the characteristic surface, of
which the first is one orbit of minimum type while the last is another (or a
point). Take a set of points $Q_1, Q_2, \ldots, Q_n$ along an arbitrary curve
of this sort so that the curve is divided in $n$ equal parts and construct the
orbital arcs $Q_1 Q_2, Q_2 Q_3, \ldots, Q_n Q_1$. In this way we get a modified
curve $Q_1 Q_2 \cdots Q_n$ along which J is not larger than along the first
curve. Moreover, since J may be taken less than $J'$ along the curve
$P_1 P_2 \cdots P_n$, each arc of the curve $Q_1 Q_2 \cdots Q_n$ will be less
than $d = J'/n$. By this process we may replace the given series of curves by
another of the same sort but with the further property that every orbital arc
is of length less than $d$. If we prevent the vertices from coinciding
throughout the variation by a very small modification first of the path of the
vertex $Q_2$ so as to avoid $Q_1$, then of $Q_3$ so as to avoid $Q_2$, and so
on, there results a sequence of curves $Q_1 Q_2 \cdots Q_n$, varying from one
orbit of minimum type to the other (or to a point) and corresponding to a line
within $D_{2n}$.

Applying the minimax principle referred to we infer that there exist $l$ points
of minimax of the function $J(P_1, P_2, \ldots, P_n)$ within $D_{2n}$. Let us
develop the properties of the corresponding curves $P_1 P_2 \cdots P_n$.

Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$ be the coordinates of the
points $P_1, P_2, \ldots, P_n$ respectively. These $2n$ variables form a
suitable set of coordinates of a point in $D_{2n}$, at least near the point
which corresponds to the minimax. The integral J becomes then a function of
these variables. Of course the condition that the directional derivatives all
vanish is independent of the particular choice of variables made in $D_{2n}$.

However, the formula for the variation of J with a single vertex P, when L
has the normal form (10) with $\alpha = \beta = 0$, is

$$\delta J = (x_1' - x_2')\,\delta x + (y_1' - y_2')\,\delta y, \tag{24}$$

where $x$ and $y$ denote the coordinates of that vertex, and $x_1'$, $y_1'$ and
$x_2'$, $y_2'$ stand for the values of $dx/dt$, $dy/dt$ in a forward and
backward direction respectively at the vertex. Hence, in order that the
directional derivatives all vanish, it is necessary and sufficient that $x_1'$
and $y_1'$ are respectively equal to $x_2'$ and $y_2'$ at every vertex, i.e.,
that the two orbital arcs which abut upon the vertex have the same direction.
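The content of (24) can be checked in the simplest special case. The sketch below is an illustration under assumptions of our own: the orbital arcs are straight segments in the Euclidean plane traversed at unit speed, so that J is ordinary arc length, and the gradient with respect to a vertex is compared with a central finite difference. The function names are hypothetical, and the sign convention (tangent of the arriving arc minus tangent of the departing arc) is chosen for the example and is not claimed to match the labeling of the two directions in (24).

```python
import math


def two_arc_length(a, p, b):
    """Length of the broken line a -> p -> b with straight arcs."""
    return math.dist(a, p) + math.dist(p, b)


def vertex_gradient(a, p, b):
    """Gradient of the length with respect to the vertex p: unit tangent of the
    arc arriving at p minus unit tangent of the arc leaving p."""
    la, lb = math.dist(a, p), math.dist(p, b)
    return ((p[0] - a[0]) / la - (b[0] - p[0]) / lb,
            (p[1] - a[1]) / la - (b[1] - p[1]) / lb)


def numeric_gradient(a, p, b, eps=1e-6):
    """Central finite-difference approximation of the same gradient."""
    gx = (two_arc_length(a, (p[0] + eps, p[1]), b)
          - two_arc_length(a, (p[0] - eps, p[1]), b)) / (2 * eps)
    gy = (two_arc_length(a, (p[0], p[1] + eps), b)
          - two_arc_length(a, (p[0], p[1] - eps), b)) / (2 * eps)
    return gx, gy
```

When the vertex lies on the straight line from `a` to `b` the two unit tangents coincide and the gradient vanishes; at any other position the gradient is not zero, which is the Euclidean form of the statement that the two abutting arcs must have the same direction.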

It is therefore seen that every minimax point corresponds to a periodic
orbit. We must still prove that these orbits are of minimax type.

Such an orbit cannot be of minimum type. For the corresponding point
of $D_{2n}$ is a minimax point so that nearby points of $D_{2n}$ can be found at
which the function J is less than at the minimax point. This implies that
there is a curve of orbital arcs nearby at which J is less than along the
periodic orbit. This would not be possible if the orbit were of minimum type.



It has been seen earlier that any curve near a periodic orbit for which
$J < J'$ can be deformed continuously into a set of $n$ orbital arcs each of
length less than $d$, while J is made constantly to diminish from its initial
value along the curve. In fact, if we choose $n$ points of the curve at a
distance less than $d$ apart and very near to the orbit, the arc joining two
successive points of this set may be continuously deformed into the unique
short orbital arc connecting the same two points. If this deformation of all
the arcs be made, the required deformation of the original curve is
accomplished. Moreover, if one such curve varies into another with $J < J'$
the corresponding curves of orbital arcs may be varied one into the other with
$J < J'$.

To such a curve of orbital arcs there corresponds a point of $D_{2n}$ near the
minimax point which yields the periodic orbit.

Consequently we are led to infer that there are precisely as many distinct
classes of curves near the orbit which cannot be deformed into one another
with $J < J'$ as there are regions $J < J'$ in $D_{2n}$ which merge at the
minimax point.

Therefore the orbit is of minimax type, and, if we agree to count it according
to the multiplicity of the number of classes of curves $J < J'$, the first
statement made at the outset has been completely proved. It will be found
later (§ 19) that there are at most two such classes, so that this possibility
of multiply taken orbits of minimax type does not really arise.

Suppose now that the given periodic orbits of minimum type are not
deformable to a point on the characteristic surface, and that we have p = 1.
In this case the characteristic surface is torus-shaped and any such orbit may
be deformed into itself on the surface by slipping around it. Consequently
if we form the continuum $D_{2n}$ as before that region will be doubly
connected. If we consider the torus to be developed upon an infinite right
circular cylinder in such wise that the given orbits of minimum type correspond
to closed curves on the cylinder, the operation of slipping a curve around the
cylinder may be defined as a continuous deformation of such a curve which takes
such a curve into a congruent adjacent curve on the cylinder. A corresponding
path in $D_{2n}$ can evidently not be deformed to a point, for that would mean
that a set of curves joining a curve to a congruent curve might be continuously
modified to a single curve, whereas it will always join a curve to a distinct
congruent curve.

Thus we infer the existence of l orbits of minimax type in this case by the 
minimax principle of the last paragraph with m = 1. 

On the other hand if we have p > 1 it will not be possible to slip an orbit
on the characteristic surface into itself except through a set of curves which
may be modified to a single curve. Otherwise the set of curves would sweep
out a torus-shaped part of the characteristic surface, and there is no such
part for p > 1.




Here then we can only infer the existence of $l - 1$ orbits of minimax type.

17. The minimax method for p = 0. Reversible case. The minimum
method afforded a proof of the existence of periodic orbits for p > 0. The
minimax method has an especial interest in the case p = 0. From it we
shall infer the existence of one periodic orbit of minimax type in the case
p = 0, at least for reversible dynamical problems.

If the characteristic surface is closed and of genus 0 in a reversible problem,
and if there exist $l \ge 0$ periodic orbits of minimum type, then there exist at
least $l + 1$ periodic orbits of minimax type.

We commence with the simplest case $l = 0$. Here the intuitive formulation
of the method of proof becomes very clear if one adopts the geodesic
interpretation used above (see the introduction).

Consider any family of curves on the characteristic surface, analogous to a 
family of parallel circles on the sphere, and defined specifically as follows: 
(1) the curves form a continuous series of which the first and last are point 
curves, (2) the curves are rectifiable with an upper limit of length, (3) one 
and only one curve passes through each point of the surface. 

Such a set of curves is evidently expressible in terms of two parameters:
first, an angular parameter v of period $2\pi$ which varies along each curve of
the family, and secondly a parameter y which varies from 0 to 1 as the curves
vary, so that y = 0 and y = 1 correspond to the point curves.

A set of curves of this sort will be said to form a normal covering of the 
characteristic surface. Any family of curves, of which the first and last are 
point curves, and which may be derived from a family which gives a normal 
covering by continuous variation, will be said to form a covering of the surface. 

A curve which passes in order through all of the curves of a covering will 
be said to slip over the characteristic surface. 

If a varying curve slips over the characteristic surface, every point of the 
surface will be a point of a curve at some stage of its variation. 

For, conceive of the normal covering as a continuous membrane which covers 
the given surface. Any other covering is obtained by continuous distortion 
from this particular covering. But this yields merely a distorted membrane 
which must still cover the surface, so that every point lies on some curve of 
the distorted covering.* 

* A rigorous proof of the theorem of analysis situs involved is not given here on account
of the obvious truth of the theorem. Such a proof can be made by commencing with the case
of coverings which vary analytically, in which it is seen that every point is covered an odd
number of times, and then passing by a limiting process to the general case.

Having introduced these preliminary ideas, we are prepared to give the
application of the method of the preceding paragraph to the case p = 0.

As before we consider n points P_1, P_2, ..., P_n as determining a point in a
2n-dimensional manifold C_2n. Also, if J' is taken so large that a curve of
length J on the characteristic surface may be made to slip over the surface
with J < J', we define the manifold D_2n as the region of C_2n for which the
successive points P_1, P_2, ..., P_n are not greater than a distance d apart.

A family of curves P_1 P_2 ... P_n made up of orbital arcs of lengths less
than d and constituting a covering can now be obtained from the given covering,
with J < J' along each curve of the new covering. Such a family of
curves corresponds to a line within D_2n joining one point J = 0, at which
P_1, P_2, ..., P_n coincide, with another such point.

Now the points of D_2n for which J is zero evidently constitute a closed
two-dimensional surface on the boundary of D_2n, inasmuch as we have one point
J = 0 for each point of the characteristic surface.

The above line beginning and ending at a point of this two-dimensional
boundary cannot be deformed to lie wholly in this boundary. In the contrary
case we should be able to deform the covering of curves made up of
orbital arcs into a series of point curves, and not passing through a given
point of the characteristic surface. This has been seen to be impossible.

We infer that D_2n is not linearly simply connected. By the minimax principle
of § 15 we therefore are led to infer the existence of a point of minimax
within D_2n, for we have a point of minimum J = 0 in D_2n.

Hence we infer as before that there will exist a corresponding periodic orbit
of minimax type.

In the case when there are l > 0 given periodic orbits of minimum type
the same method obviously leads to the conclusion that there exist l + 1
periodic orbits of minimax type as stated.

Thus we see that there exists at least one closed geodesic of minimax type 
on any surface of genus 0 (see introduction). 

18. Introduction of concave boundaries. The results of the preceding paragraph
admit of an easy extension when the characteristic surface possesses
one or more concave boundaries:

If the characteristic surface in a given reversible problem has one or more
concave boundaries,* and if there exist l periodic orbits of minimum type
deformable into one another, then there will exist at least l or l − 1 periodic
orbits of minimax type into which they may be deformed, according as the given
orbits may or may not be deformed to a point.

* Distant boundaries may also be admitted as in § 9.

Let us suppose at first that the boundaries are formed of orbital arcs meeting
with interior angles less than π, or of a single periodic orbit. Precisely as in
the case of a closed characteristic surface we may define a 2n-dimensional
continuum C_2n and a second continuum D_2n lying within it.

The essential difference is that here D_2n possesses a boundary corresponding
to each of the given concave boundaries. When a vertex P_i of P_1 P_2 ... P_n
lies on a concave boundary, the corresponding point of D_2n lies upon a
boundary of this description. Thus D_2n possesses boundaries of a new type as
well as boundaries of the earlier type corresponding to the possibility that
adjacent vertices of P_1 P_2 ... P_n may coincide or lie at a distance d apart.

Now let us suppose further that the region D_2n continues to satisfy the
hypothesis of the minimax principle of § 15. If the given periodic orbits of
minimum type can be deformed to a point we will have l corresponding points
of minimum of D_2n as well as a point of minimum corresponding to the value
J = 0 obtained when the curve P_1 P_2 ... P_n becomes a point. Thus there
will be at least l points of minimax in this case, and at least l − 1 such points
when the orbits cannot be deformed to a point.

We see then that in order to establish our results for concave boundaries
made up of orbital arcs we need merely show that the stream lines of the
function J pass from the boundary to the interior of D_2n everywhere along the
new boundaries (see end of § 15).

If a point moves along a stream line, the 2n coordinates (x, y) of the vertices
of the curve P_1 P_2 ... P_n will vary in such a way that the partial decrease
of J due to the variation of each vertex (x, y) alone will be as large as possible
when compared to the displacement of the vertex. This indicates that the
direction of motion of each vertex of P_1 P_2 ... P_n on the characteristic surface
is along the direction of the interior bisector of the angle at the vertex. Here
the geodesic interpretation is serviceable. The direction at each vertex
evidently depends upon the directions of the two abutting arcs only, and is the
same as though only that vertex varied. The partial variation δJ at a vertex is
given by (24), and is unaltered in form by a translation or rotation of the
xy-plane. If we take a new origin at the vertex and a new y-axis along the
bisector we have x'_1 = x'_2, which shows that we have δx = 0. Also since J
diminishes along the inner bisector, the direction of motion is along the inner
bisector.

Hence if we have a point on a boundary of D_2n, which corresponds to a
curve P_1 P_2 ... P_n having one or more vertices on a concave boundary, the
vertices of that curve will move away from the concave boundary as the point
of D_2n moves along the stream line. It was observed earlier that as long as
the vertices do not cross the concave boundaries, the orbital arcs do not.

If the angle at any vertex is π the above argument fails. In this special
case the boundary is made up of a single orbital arc near the vertex. Such a
vertex lies on an orbital arc of P_1 P_2 ... P_n terminated by vertices at which
the angle is not π. As the corresponding point of D_2n moves along its stream
line the end vertices move along the inner bisectors as before, while the other
vertices on the arcs remain nearly stationary (see (24)). Hence the adjoining
angles begin to become less than π on the interior side, and their vertices move
toward the interior region. Thus this case is also disposed of.

Since we may surround a concave boundary not made up of orbital arcs
by a second concave boundary made up of such arcs and lying in its immediate
neighborhood (§ 8), it is clear that we may regard the italicized statement as
demonstrated in all cases in which none of the concave boundaries consist
of a single periodic orbit. We proceed now to consider this case.

The possibility that such a bounding orbit is of minimum type is at once
disposed of. For the stream lines will all move toward the interior of D_2n
on the corresponding boundary, save at those points which correspond to an
arc P_1 P_2 ... P_n making up this orbit itself. But these points correspond to
a minimum of J in D_2n which lies on the boundary of D_2n. Such a point does
not necessitate a modification of our argument.

If, however, such an orbit is not of minimum type it may be approached
by a concave boundary of orbital arcs. This fact will appear in (a) of the
following paragraph. The new boundary may be used in place of that
afforded by the periodic orbit, when the preceding argument becomes applicable.
Consequently the italicized statement is true in all cases.

19. Scope of the minimax method. By definition of periodic orbits of
minimax type these have the characteristic property that nearby curves with
a smaller value of J fall into two (or more) manifolds of curves which cannot
be deformed one into the other with J less than along the periodic orbit of
minimax type. In order to determine the scope of the minimax method we
are led to inquire how many such manifolds of curves J < J' there will exist
along an arbitrary periodic orbit for which we have J = J'. Along an orbit
of minimum type there are no such curves so that we do not consider that
case; this is the case σ = ∞ (see § 14).

If σ < π along a given periodic orbit for which J = J', any nearby curve for
which J is less than along the given orbit may be deformed into any other such
nearby curve under the restriction J < J'. If σ > π but σ ≠ ∞, any nearby
curve with J < J' belongs to one of two classes, any two curves of either class
being deformable into one another through nearby curves with J < J', and the
curves of one class not being deformable into curves of the other with J < J'.
If σ = π there are either one class or two classes of nearby curves with J < J'.

(a) Proof that I may be taken positive. We commence by showing that,
if σ ≠ ∞ along the given periodic orbit, the function I in the differential
equation (16) of normal displacement may be taken positive.

For σ ≠ ∞ it has been seen that any non-identically zero solution δn of
this equation vanishes infinitely often as t ranges from −∞ to +∞. On
this account we can find a set of t intervals which include all the points of the
interval 0 ≤ t ≤ τ (τ the period of the periodic orbit) as interior points, and
each of which has two successive zeros of a solution δn for first and last
point. For instance if σ > 2π a single interval of this sort can be found, since
solutions exist which nowhere vanish in this interval.
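
The vanishing of the solutions δn may be examined numerically. What follows is a minimal sketch in Python, assuming that the equation of normal displacement has the form δn'' + I δn = 0 and adopting a purely hypothetical periodic coefficient I(t) = 1 + 0.3 cos t. The name solution_zeros and every numerical value are illustrative and are not taken from the text. The sketch integrates one solution and lists its successive zeros, consecutive pairs of which bound intervals of the kind just described.

```python
import math

def solution_zeros(coef, t0, t1, y0, v0, steps=20000):
    """Integrate y'' + coef(t) * y = 0 by classical RK4 on [t0, t1] and
    return the approximate zeros of y, located through sign changes."""
    h = (t1 - t0) / steps
    y, v = y0, v0
    zeros = []
    for i in range(steps):
        t = t0 + i * h
        # One RK4 step for the first-order system y' = v, v' = -coef(t) * y.
        k1y, k1v = v, -coef(t) * y
        k2y, k2v = v + 0.5 * h * k1v, -coef(t + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -coef(t + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -coef(t + h) * (y + h * k3y)
        y_new = y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v_new = v + h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        # A sign change brackets a zero; place it by linear interpolation.
        if y * y_new < 0 or (y_new == 0.0 and y != 0.0):
            zeros.append(t + h * y / (y - y_new))
        y, v = y_new, v_new
    return zeros

# Hypothetical coefficient I(t) of period 2*pi (an assumption, not from the text).
coefficient = lambda t: 1.0 + 0.3 * math.cos(t)
zeros = solution_zeros(coefficient, 0.0, 4 * math.pi, 0.0, 1.0)
print([round(z, 4) for z in zeros])
```

For the constant coefficient I = 1 the solution with these initial values is sin t, whose zeros fall at the multiples of π; this affords a check upon the integration.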

Regard now t and δn as the rectangular coordinates of a point in the plane
and construct all the curves δn = δn(t) which correspond to the set of intervals
formed by successive zeros. We will agree to alter the sign of the functions if
necessary so that all of these curves shall lie above the t-axis in the interval
under consideration. Let us construct also all congruent curves obtained
by shifting these curves by a multiple of τ to right or left. All of the curves
so obtained will represent solutions of the differential equation of normal
displacement, inasmuch as the function I is periodic in t of period τ.

The complete set of curves so obtained and the t-axis evidently include a
strip, of which the lower boundary is that axis and the upper boundary A is a
series of arcs of curves which represent solutions of the differential equation
of normal displacement meeting at angles greater than π toward the t-axis.
It is also apparent that the upper boundary may be obtained from the part
0 ≤ t ≤ τ by shifting this part to the right or left by a multiple of τ.

Incidentally we observe that on either side of the periodic orbit for which
σ ≠ ∞ a nearby curve of orbital arcs (corresponding to A) can be found
meeting at angles less than π away from the orbit. This fact was used at
the end of § 18.

Evidently this upper boundary may be looked upon as a curve whose curvature
is equal to that of the tangent δn curve save at the vertices where it is
infinitely greater away from the axis (Fig. 7). More exactly, it will be
possible to draw a nearby analytic curve A*, only slightly differing from this
boundary, whose curvature at each point will exceed that of the tangent
curve representing a solution of the differential equation.

Although the possibility of this last construction is intuitively manifest,
we shall say a few words about the analytic details. Divide the t-axis into a
set of intervals which include all save the immediate vicinity of the vertices
and which are distributed into congruent sets obtainable from one set by a
shift along the t-axis through a multiple of τ. Immediately above an included
segment of the boundary curve we may draw another, A_1, obtained by
multiplying the ordinates of the boundary segment by a fixed constant 1 + a,
where a is small and positive. Let us also increase I slightly throughout
this interval, say to I + d, where d is a small positive constant. The new
differential equation

    δn'' + (I + d)δn = 0

will have solutions which differ but slightly in position and direction from the
solutions of the equation of displacement. A solution δn can be found which
is represented by a curve A_2 lying wholly within the narrow strip between
A and A_1 and nearly coincident in direction with A.

The curvature of this curve will exceed that of the tangent curve
representing a solution of the differential equation of normal displacement. In
fact the ordinate and slope of the two curves are the same, while δn'' for the
new curve will exceed δn'' for the boundary arc by precisely d δn, as a
comparison of the two differential equations shows.
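
The comparison of the two equations may be illustrated numerically. The following is a sketch only, assuming the forms δn'' + I δn = 0 and δn'' + (I + d)δn = 0, with a hypothetical coefficient I(t) = 1 + 0.3 cos t and a hypothetical constant d = 0.01; none of these values, nor the name integrate, comes from the text. Two solutions are started with the same ordinate and slope, their greatest separation over a short interval is found, and the difference of the second derivatives at the common point is compared with d δn.

```python
import math

def integrate(coef, y0, v0, t0, t1, steps=2000):
    """Classical RK4 for y'' + coef(t) * y = 0; returns a list of (t, y, v)."""
    h = (t1 - t0) / steps
    y, v = y0, v0
    out = [(t0, y, v)]
    for i in range(steps):
        t = t0 + i * h
        k1y, k1v = v, -coef(t) * y
        k2y, k2v = v + 0.5 * h * k1v, -coef(t + 0.5 * h) * (y + 0.5 * h * k1y)
        k3y, k3v = v + 0.5 * h * k2v, -coef(t + 0.5 * h) * (y + 0.5 * h * k2y)
        k4y, k4v = v + h * k3v, -coef(t + h) * (y + h * k3y)
        y += h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        v += h * (k1v + 2 * k2v + 2 * k3v + k4v) / 6
        out.append((t + h, y, v))
    return out

# Hypothetical data (assumptions made for illustration only).
coef_I = lambda t: 1.0 + 0.3 * math.cos(t)
d = 0.01
coef_raised = lambda t: coef_I(t) + d

y0, v0 = 1.0, 0.0  # common ordinate and slope at t = 0
base = integrate(coef_I, y0, v0, 0.0, 1.0)
raised = integrate(coef_raised, y0, v0, 0.0, 1.0)

# Greatest separation of the two solution curves over the short interval.
max_gap = max(abs(a[1] - b[1]) for a, b in zip(base, raised))
# Difference in magnitude of the second derivatives at the common point.
accel_gap = coef_raised(0.0) * y0 - coef_I(0.0) * y0
print(max_gap, accel_gap)
```

Under these assumptions the separation remains of the order of d over the interval, while the second derivatives at the common point differ in magnitude by d δn.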

Construct further the congruent curves of this auxiliary curve in the
congruent intervals, and make a similar construction in other sets of congruent
intervals. We obtain in this way a curve A_2 defined within the given set of
intervals.

We propose now to define the curve A_2 in the excluded intervals (see
interval AD of figure) which fall into congruent sets. Take one of these short
excluded intervals which contains a vertex of the boundary curve. Join the
two adjacent ends of the arcs A_2 by a short arc whose curvature is large away
from the t-axis save near the end-points and everywhere exceeds that of the
tangent δn curve. Since the curvature is greater than that of the tangent
δn curve at the end-points of the arcs of the auxiliary curve A_2 defined in the
two adjoining intervals, we may make this short arc tangent to the auxiliary
arcs at the common end-points with equal curvature.

Also construct congruent short arcs in the congruent excluded intervals, 
and treat the other sets of congruent excluded intervals in like manner. 

In this way we complete the construction of a curve A_2 representing a
single-valued function of t with continuous first and second derivatives, periodic
of period τ, which has the further property that the curvature of the
corresponding curve at each point exceeds that of the tangent δn curve.

Hence it follows at once that we may find a nearby analytic curve with the 
same property.* 

Now it is known to be possible to make a conformal change of variables
in the xy-plane which alters arc lengths along a given periodic orbit in any
desired ratio (see § 3), and thus small normal distances in nearly the same
ratio. Imagine then that the particular conformal transformation has been
made which alters arc lengths in inverse proportion to the height of the
auxiliary analytic curve above obtained.

Since δn is proportional to infinitesimal displacements from the given
orbit, the transformed equation of normal displacement will have its normal
distances affected in the same ratio. Hence the transformed auxiliary curve
will be a line parallel to the new t-axis whose curvature at each point exceeds
that of the tangent δn curve. This means merely that for δn > 0, δn' = 0,
we have δn'' < 0. We infer from (16) that I must now be positive. Therefore
it is legitimate to assume that I is positive for σ ≠ ∞, as was to be proved.

The condition σ ≠ ∞ is satisfied by the periodic orbits under consideration.

(b) Construction of a minimizing curve P_1 P_2 ... P_n in a strip. Let us
now map the given periodic orbit conformally upon the s-axis in an sn-plane
with preservation of arc lengths along the orbit, and let us confine attention
to the strip contained between the parallels n = c_1 > 0 and n = c_2 < 0
within which lies the s-axis. On account of the relation between normal
displacements along the periodic orbit and the solutions of the differential
equation of normal displacement, we see that, for c_1 and c_2 sufficiently small,
orbits tangent to the two sides of this strip will lie within the strip near the
point of tangency. More generally we see that at any point of parallelism
with the s-axis and near to it the orbit is concave toward the s-axis with a
curvature which is of the first order in the distance from that axis and is
given to terms of the second order by −In.

We restrict attention to the curves near to the given orbit which lie in
this strip. Such a curve may be thought of as joining a point s = 0 to a point
s = L (L the length of the orbit) having the same ordinate. The complete
image consists of course of this segment and congruent segments obtained
by a shift to right or left by a multiple of L.

Introduce now a set of equidistant ordinates taken near together, of which
the first is s = 0 and the last s = L. The nearby curve will intersect each
of these ordinates at least once, say in the points P_1, P_2, ..., P_n where P_1
and P_n are the initial and terminal points of the nearby curve.

Consider the curve formed by the orbital arcs P_1 P_2, P_2 P_3, ..., P_{n-1} P_n.
If c_1 and c_2 are sufficiently small these arcs will be almost parallel to the s-axis
and long in comparison with c_1 and c_2. A previous construction is available
to deform the given nearby curve into the set of orbital arcs while J is decreased
still more. This constitutes the first deformation of the curve under the
restriction J < J' which we will make. The curve P_1 P_2 ... P_n of orbital
arcs may not lie wholly in the strip, but its vertices lie in the strip.

We will now vary the vertices P_1, P_2, ..., P_n in the strip up and down
the vertical ordinates on which these points lie, while diminishing J further.
The integral J appears as an analytic function of the ordinates of these
points in which the n variable ordinates vary independently between the
limits c_1 and c_2. Consequently we may vary these points P_i, and with them
the curve P_1 P_2 ... P_n, to a relative minimum. This constitutes the second
deformation of the given nearby curve which we will make.
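
The second deformation may be imitated in a toy model. The sketch below is not the construction of the text: it assumes straight chords in place of orbital arcs, and it weights the Euclidean length of each chord by a hypothetical factor 1 − k n², chosen only so that the axis n = 0 is not of minimum type. The names weight, segment, total_J and minimize_in_strip, the step size, and all numerical values are likewise assumptions. Each vertex is moved along its own ordinate in the direction which diminishes J, and is held within the limits of the strip.

```python
import math

def weight(n, k=1.0):
    """Hypothetical length factor; curves are shorter away from the axis n = 0."""
    return 1.0 - k * n * n

def segment(ya, yb, h, k=1.0):
    """Weighted length of the chord joining adjacent ordinates a distance h apart."""
    return weight(0.5 * (ya + yb), k) * math.hypot(h, yb - ya)

def total_J(ys, h, k=1.0):
    """Value of J along the closed broken curve with vertex ordinates ys."""
    m = len(ys)
    return sum(segment(ys[i], ys[(i + 1) % m], h, k) for i in range(m))

def minimize_in_strip(ys, h, c_low, c_high, k=1.0, step=0.05, iters=500, eps=1e-6):
    """Projected gradient descent: vertices slide on their ordinates, c_low <= n <= c_high."""
    ys = list(ys)
    m = len(ys)
    for _ in range(iters):
        grads = []
        for i in range(m):
            left, right = ys[i - 1], ys[(i + 1) % m]
            # Only the two chords abutting at vertex i depend upon its ordinate.
            local = lambda y: segment(left, y, h, k) + segment(y, right, h, k)
            grads.append((local(ys[i] + eps) - local(ys[i] - eps)) / (2 * eps))
        ys = [min(c_high, max(c_low, y - step * g)) for y, g in zip(ys, grads)]
    return ys

m = 20
h = 2 * math.pi / m            # spacing of the equidistant ordinates
c_high, c_low = 0.1, -0.05     # hypothetical limits c_1 > 0 and c_2 < 0
above = minimize_in_strip([0.01] * m, h, c_low, c_high)
below = minimize_in_strip([-0.01] * m, h, c_low, c_high)
print(max(above), min(below), total_J(above, h), total_J(below, h))
```

In this model a curve begun slightly above the axis comes to rest with its vertices on the upper edge, and one begun slightly below with its vertices on the lower edge, J being in each case less than its value along the axis. The model gives no information as to whether the two final curves may be deformed into one another.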

(c) Proof that there are at most two classes of nearby curves P_1 P_2 ... P_n
with J < J'. Let us now turn to a consideration of the form of the minimizing
curve P_1 P_2 ... P_n so obtained.

In the first place there can be no vertex P_i within the strip at which the
angle is not π. For we may freely vary that vertex up and down with a
variation of J given by the formula δJ = (y'_1 − y'_2)δy (see (24)), where y
denotes the ordinate of the vertex and y'_1 and y'_2 denote the slopes on the
two sides of the point. This variation may be made negative if y'_1 and y'_2
are unequal, which is impossible. But x'_1 and x'_2 will be equal if y'_1 and y'_2
are equal by (4'). Accordingly the two arcs have the same direction at an
interior vertex.
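
The formula for δJ may be checked in the simplest setting. The following sketch assumes that the two abutting arcs are straight Euclidean segments and that J is their total length, which is an illustrative stand-in and not the integral J of the text; the names broken_length and first_variation are hypothetical. It compares a finite-difference value of the rate of change of J with the difference of the slopes on the two sides of the vertex.

```python
import math

def broken_length(y_left, y_mid, y_right, h):
    """Length of the two-segment broken line through three ordinates spaced h apart."""
    return math.hypot(h, y_mid - y_left) + math.hypot(h, y_right - y_mid)

def first_variation(y_left, y_mid, y_right, h):
    """Leading-order rate of change of length with the middle ordinate: y'_1 - y'_2."""
    slope_in = (y_mid - y_left) / h     # slope y'_1 on the first side
    slope_out = (y_right - y_mid) / h   # slope y'_2 on the second side
    return slope_in - slope_out

# Hypothetical ordinates with small slopes, as for curves near the axis.
y_left, y_mid, y_right, h = 0.0, 0.01, -0.01, 1.0
eps = 1e-6
numeric = (broken_length(y_left, y_mid + eps, y_right, h)
           - broken_length(y_left, y_mid - eps, y_right, h)) / (2 * eps)
print(numeric, first_variation(y_left, y_mid, y_right, h))
```

For small slopes the two values agree to terms of the third order in the slopes, and both vanish when the two segments have the same direction.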

The same formula shows that if a vertex lies upon the upper edge of the
strip the angle toward the axis of s must be as large as π since y'_2 must be as
large as y'_1. The same fact is true of the vertices on the lower edge of the
strip.

The extreme vertices P_1 and P_n require no especial attention. It is really
the curve made up of P_1 P_2 ... P_n together with the congruent curves referred
to above that are under consideration.

The simplest possibility is that all the vertices lie upon one and the same
edge of the strip. If these are on the upper edge each constituent orbital arc
has no minimum point between its end-points, for the curvature of any such
arc has been seen to be toward the s-axis at a point of parallelism with that
axis. Hence each such arc lies above that edge with one interior maximum
point. Likewise if the vertices lie upon the lower edge each arc will lie below
that edge with one interior minimum point.

We will establish that for c_1/c_2 large no other possibility can arise.

Suppose if possible that P_1 P_2 ... P_n lies partially but not wholly within the
strip, and let PQ be an interior arc of this curve ending on the sides of the
strip. The arc PQ is evidently a single orbital arc. We recall that all interior
vertices yield an angle π.

If the point P is not a vertex and we continue the curve to a vertex P_i,
the orbital arc P_i P must lie wholly outside of the strip, and the orbital arc
P_i Q contains a point of parallelism with the s-axis between P_i and P, that is
near to P. On the other hand if P is a vertex the arc PQ cuts the preceding
orbital arc of P_1 P_2 ... P_n with an exterior angle at least π. If this angle
exceeds π the curve PQ when prolonged beyond P lies below the preceding
orbital arc. Since it cannot intersect it again (nearly coincident orbital arcs do
not intersect at two nearby points), the prolongation will cut the edge of the
strip again between the ordinates corresponding to the two vertices of the
preceding arc. Here again we have a nearby point of the prolongation of
PQ which has a direction parallel to the s-axis. In the limiting cases when the
curve PQ touches the edge of the strip at P, and when the angle at the
vertex is π, the same thing is true.

We are thus led to the conclusion that any interior arc PQ of P_1 P_2 ... P_n
forms a single orbital arc which when prolonged beyond P and Q will
necessarily have a point of parallelism with the s-axis in the vicinity of either point.

The ratio of the ordinates of orbital arcs at such horizontal points lies
between fixed limits, just as the ratio of the ordinates δn at points of
maximum or minimum of a solution δn of the differential equation of normal
displacement remains between fixed limits. Hence if we choose c_2 to be
sufficiently small in comparison with c_1 the curve P_1 P_2 ... P_n cannot have
an interior arc PQ of which one end-point lies on the upper edge n = c_1 of the
strip. Otherwise there would be a horizontal point nearby, and the adjacent
horizontal point would necessarily lie on the opposite side of the s-axis and
relatively much below n = c_2. This is not possible.

Hence only interior arcs PQ can exist which begin and end on the lower 
edge of the strip. 

But such interior arcs cannot exist either. For, observe that the orbital
arc PQ must cross the axis between P and Q at least once; if it did not there
would be a maximum below the axis which has been seen to be impossible.
Let P' and Q' be two adjacent points of crossing of that axis within PQ. The
abscissas of P' and Q' cannot approach those of P and Q any more than the
points of zero slope and of zero value of the solutions of the differential
equation of normal displacement can approach each other.

Now Q' is nearly the forward conjugate point of P' both along the orbit
n = 0 and along P' Q'. Let P'' Q'' be an arc of PQ, lying between two of
the equidistant ordinates, such that the forward conjugate point of P'' comes
before Q''. There will then exist curves near to P'' Q'' joining these same
two points which give a lesser value to J than does P'' Q'' itself.* Hence by
our first method we can replace this curve by a nearby curve of orbital arcs
with vertices on the equidistant ordinates along which J is less than before.
Thus P_1 P_2 ... P_n did not afford a relative minimum of J among all curves
P_1 P_2 ... P_n derived by continuous variation, which is contrary to hypothesis.

We see then that no interior arcs PQ exist. Moreover the minimizing curve
P_1 P_2 ... P_n cannot lie wholly within the strip. In the contrary event this
curve would form a periodic orbit cutting the s-axis at least twice for
0 ≤ s ≤ L. On this account the argument just used would show that the
curve could not possess the minimizing property.

* See Bolza, Variationsrechnung, pp. 83-84.



Consequently we infer that all of the vertices P_i lie either on the upper
or the lower edge of the strip. There are thus at most two classes of nearby
curves equivalent under deformation with J < J', and a representative in
each class is furnished by the two curves P_1 P_2 ... P_n with vertices on an edge
of the strip.

We shall finish a proof of the italicized statement by showing that if σ < π
there is only one such class, whereas if σ > π there are two classes.

(d) Proof that there is only one class of curves J < J' if σ < π. Suppose
that we have σ < π. In this case every solution of the differential equation
of normal displacement vanishes at least twice in the interval 0 ≤ t ≤ τ.
Along the orbit the second forward conjugate point to s = 0 precedes s = L.
We may therefore find points P_1, Q_1, P_2, Q_2 on the s-axis lying in the interval
0 ≤ s ≤ L in the order named, such that Q_1 follows the conjugate of P_1, and
Q_2 follows the conjugate of P_2.

If now R_1 be a variable point upon some intermediate ordinate and we draw
the orbital arcs P_1 R_1 and R_1 Q_1 we will get a varied curve P_1 R_1 Q_1 which
will coincide with the s-axis when R_1 is upon that axis. Here it is assumed
that the ordinate upon which R_1 lies precedes the forward conjugate of P_1
and follows the backward conjugate of Q_1 so that P_1 R_1 and R_1 Q_1 are uniquely
determined. The arcs P_1 R_1 and R_1 Q_1 will meet at an angle which exceeds π
away from the s-axis. The second variation of J along P_1 R_1 Q_1 will be
negative if R_1 varies upon either side of the s-axis.* Now let us construct an
analogous arc P_2 R_2 Q_2 and consider the curve made up of the two arcs
P_1 R_1 Q_1, P_2 R_2 Q_2 and the s-axis (0 ≤ s ≤ L). If both R_1 and R_2 lie upon
the same side of the s-axis, J is less along this curve than along that axis. The
same is true if R_1 and R_2 lie upon opposite sides of the axis.

Now let R_1 be held fast while R_2 varies to the other side of the s-axis. Next
let R_2 be held fast while R_1 varies to the same side as R_2. During this
variation J will constantly remain less than along the axis. In this way we can
deform a nearby curve on one side of the axis to one on the other with J < J'
throughout.


But all of our preceding arguments apply to a strip 0 ≤ n ≤ c_1, which is
in effect the case c_2 = 0. When the initial curve lies in this strip we conclude
then that it may be deformed to have its vertices upon one of the edges of the
strip with J < J'. It cannot be deformed into the axis itself since we have
J = J' along the axis. Therefore, when R_1 and R_2 lie above the s-axis the
curve made up of P_1 R_1 Q_1, P_2 R_2 Q_2, and the s-axis, may be deformed into a
curve P_1 P_2 ... P_n with vertices upon n = c_1, and likewise when R_1 and R_2
lie below the s-axis the curve may be deformed into a curve P_1 P_2 ... P_n
with vertices upon n = c_2.


* See Bolza, Variationsrechnung, pp. 83-84, for a consideration of this classical
form of variation.




By combining these deformations or their inverses in a proper order we
will deform a curve P_1 P_2 ... P_n with vertices upon n = c_2 into a like curve
with vertices upon n = c_1 through a series of nearby curves along each of
which we have J < J'. Since it was previously established that any curve
can be deformed into one of the two special positions of P_1 P_2 ... P_n it
follows that all nearby curves with J < J' may be deformed into each other
with J < J'.

(e) Proof that there are two classes of curves J < J' if σ > π. It remains
to prove that if σ > π the two special positions of P_1 P_2 ... P_n do belong to
distinct classes, i.e., cannot be deformed into one another with J < J'
throughout.

In order to do so we begin by associating with a point P of the given periodic
orbit an opposite point Q which precedes the forward conjugate of P and
follows the backward conjugate of P. It is precisely because of the fact
σ > π that such a point will exist. Moreover P will then be an opposite
point of Q. Now let P vary to Q along the orbit in one sense. During this
variation the forward and backward conjugates of P will remain distinct so
that we may select a continuously varying opposite point Q varying from
Q to P in the same sense at the same time. Thus an involution of opposite
points on the orbit is determined.

Imagine now the orbit thrown upon a circle and let P' and Q' denote the
point of bisection of the arcs PQ and QP respectively, where P and Q are a
pair of opposite points. The point P' stands in the same relation to P as Q'
does to Q.

Hence we define in this way a deformation of a point P of the orbit into a
point P' of the circle in such wise that opposite points become opposite points
of the circle. It is possible that one point of the circle corresponds to more
than one point of the orbit, but, since the correspondence is continuous, we
can conceive of a deformation of the diameters of the circle (one diameter for
each pair of opposite points) which makes the correspondence one-to-one.
Thus we may always think of the pairs of opposite points as deformed
continuously into the opposite points of a circle.

Now suppose the strip c_2 ≤ n ≤ c_1 deformed into a double ring in a plane
in such wise that radial lines correspond to the ordinates in the sn-plane and
the orthogonal circles correspond to the lines n = const. Furthermore, we
will suppose that the pair of ordinates which correspond to a pair of opposite
points appear as superposed radial lines.

If it were possible to deform a curve from above the s-axis to below and
not have opposite points on the axis at any stage, the corresponding curve in
the transformed plane would appear as a closed curve making a double circuit
of the ring which is deformed from one side of a given circle C to the other
without having a pair of superposed points lying upon it at any stage.




If this were possible it would be possible to approximate to the given family 
of curves by an analytic family which has the same property. For we must 
recall that the given family is representable by means of continuous functions 
of two variables which may be approximated to by analytic functions. 

Now in the initial position, we may assume that the first analytic curve of 
the family is not the same curve taken twice.* It is then apparent that there 
will be an odd number of superposed points on the curve. In fact there is 
only one double point for a properly chosen analytic curve which makes 
a double circuit of the ring, and any analytic variation can only introduce 
or remove these points in pairs. 

Now draw the analytic curves composed of all superposed points for the
various members of the analytic family. Since there are an odd number of
points on the first curve there are an odd number on one side of C at the
outset. As the curve varies only an even number are introduced or removed at
any stage at one and the same point. Hence there must remain an odd
number on that side of C unless there are points on C at some stage. At the
last stage, however, there are none on that side of C. We conclude that
superposed points on C must exist.

Therefore, during the deformation of a curve P_1 P_2 ... P_n from the first
special position on one side of the periodic orbit to the second special position
on the other side, there will be an intermediate position when the varying
curve cuts the orbit in a point P and its opposite point Q.

Inasmuch as J from P to Q is a minimum along the orbit, and J from Q
to P is a minimum along the orbit, this implies that J ≥ J' along this
particular curve. It must not be forgotten that the conjugate point of P in either
direction lies outside of PQ.


Thus we have $J \geq J'$ along one of the varying curves, which is contrary
to our hypothesis.

Our original statement is now fully proved. 

The minimax method yields therefore only periodic orbits along which
$\sigma \geq \pi$. I shall not attempt to go further and show that all orbits of this
type can be obtained by the minimax method. It is possible to give extensions
of that method which do yield all of these orbits, but the conditions of application
which I have found render these extensions practically useless.

20. Method of analytic continuation. Reversible case. The preceding 
methods fail to apply for $\sigma < \pi$. We proceed now to a method which is not
subject to this limitation, namely the method of analytic continuation. 

The results established by Poincaré enable one to affirm that if the differential
equations of the dynamical problem under consideration involve a
parameter $\mu$, then (1) the periodic orbits vary analytically with $\mu$ and (2)
they can only disappear or come into existence in coincident pairs.

* This can be done unless the varying curve is taken twice throughout, in which case it
will have a superposed point wherever it cuts the axis.

This method of analytic continuation is not applicable save for “small”
changes in the parameter $\mu$. To make possible an extension to a preassigned
interval $\mu_0 \leq \mu \leq \mu_1$, it is necessary to prove that the period of the varying
periodic orbit does not become infinite. The recognition of this limitation
of the method led Poincaré to the formulation of his last geometric theorem.
But the application of this theorem depends upon a construction known to
be valid only for a “small” variation of the parameter.

In the present and immediately following paragraph we shall show in a 
wide range of cases that the period cannot become infinite. 

Preliminary to the development of our result in the reversible case we 
shall establish the following fact: 

If the characteristic surface in a reversible problem is either closed or bounded
by a finite number of ovals of zero velocity ($\gamma = 0$) without double points
($\gamma_x^2 + \gamma_y^2 \neq 0$) for $\mu_0 \leq \mu \leq \mu_1$, the number of intersections of a periodic orbit
with itself remains unchanged with variation of $\mu$.*

Let us write the equation of the orbit in the form

$x = f(t, \mu), \quad y = g(t, \mu),$

where $f$ and $g$ are analytic in $t$ and $\mu$, and periodic in $t$ of period $\tau(\mu)$, with
$\tau(\mu)$ also analytic in $\mu$.

It is evident geometrically that as $t$ increases by $\tau$ we have either described
the orbit once or a finite number of times. Since we suppose that $\tau(\mu)$ is
the least positive period of the orbit for a general $\mu$, the orbit will be described
only once as $t$ increases by $\tau$ save for exceptional values of $\mu$.

At present we will bar out the possibility that the orbit consists for all $\mu$
of a segment of a curve described in opposite senses. This case can only arise
if ovals of zero velocity are present, on which the end-points of the segment lie.

When $\mu$ has not one of these exceptional values the orbit is nowhere tangent
to itself. For if the direction of two branches of the orbit is the same or
opposite at a point, the tangent branches must coincide throughout in a reversible
problem. It follows that variation in the number of intersections
of the orbit with itself can only arise as $\mu$ passes through one of these exceptional
values.

Consider now two short arcs of the orbit which come into coincidence for
$\mu = \bar\mu$ away from the ovals of zero velocity. If we denote the normal distance
from a point P on one of these branches to the other by $v(t, \mu)$ it is clear
that $v(t, \mu)$ is analytic in $t$ and $\mu$, and by hypothesis vanishes identically for
$\mu = \bar\mu$. Hence we may write

$v(t, \mu) = (\mu - \bar\mu)^k \, [\, v_1(t) + v_2(t)(\mu - \bar\mu) + \cdots \,],$

where $v_1(t)$ is not identically zero and $k$ is a positive integer.

* Compare Poincaré, these Transactions, vol. 6 (1905), pp. 237-274.

It is clear, furthermore, that $v_1(t)$ is a solution of the differential equation
of normal displacement for $\mu = \bar\mu$. It follows that the zeros of $v_1(t)$ are
isolated, and that $v_1'(t)$ is not zero when $v_1(t)$ vanishes.

The number of crossings of the two branches when $\mu$ is nearly equal to $\bar\mu$
is indicated by the number of roots of the equation

$v_1(t) + v_2(t)(\mu - \bar\mu) + \cdots = 0$

in the vicinity of a given value of $t$. If $v_1(t) \neq 0$ for this value of $t$ there
are evidently no such values for $\mu < \bar\mu$ or $\mu > \bar\mu$. On the other hand if
$v_1(t) = 0$ we have $v_1'(t) \neq 0$, so that the usual theorems concerning the
solution of implicit equations show that there is precisely one such point of
intersection.
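
The root count just described can be illustrated numerically. The following is only a sketch under stated assumptions: the coefficient functions `v1` and `v2`, the value `MU_BAR`, and the grid-based counter are hypothetical choices made for illustration and are not taken from the paper. The series is truncated after its second term, and crossings are counted as sign changes on a uniform grid.

```python
import math

MU_BAR = 0.0  # hypothetical exceptional parameter value

def v1(t):
    # hypothetical leading coefficient: simple zero at t = 0, with v1'(0) = 1
    return math.sin(t)

def v2(t):
    # hypothetical second coefficient, nonzero at t = 0
    return math.cos(t) + 0.25

def bracket(t, mu):
    """Truncated bracket v1(t) + v2(t) * (mu - MU_BAR)."""
    return v1(t) + v2(t) * (mu - MU_BAR)

def count_crossings(mu, a, b, n=1000):
    """Count sign changes of the truncated bracket on a uniform grid over [a, b]."""
    count = 0
    prev = bracket(a, mu)
    for i in range(1, n + 1):
        cur = bracket(a + (b - a) * i / n, mu)
        if prev * cur < 0:
            count += 1
        prev = cur
    return count

# Window around the simple zero of v1, for mu on either side of MU_BAR.
near_zero = [count_crossings(mu, -0.5, 0.5) for mu in (-0.01, 0.01)]
# Window on which v1 does not vanish.
away = [count_crossings(mu, 1.0, 2.0) for mu in (-0.01, 0.01)]
print(near_zero, away)
```

In this sample one crossing is found near the simple zero of `v1` on each side of `MU_BAR`, and none on the window where `v1` stays away from zero, in agreement with the two cases distinguished above.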

At least then, if there are no ovals of zero velocity, the number of points
of intersection of the orbit with itself does not vary with $\mu$.

In case such ovals are present we need to prove further that no change in
the number of intersections can take place in the vicinity of these ovals. The
above argument becomes unavailable because a point $\gamma = 0$ yields a singular
point of the differential equation of normal displacement. We shall speak
of such an oval $\Sigma_0$ as a fixed curve (Fig. 8); a preliminary conformal
transformation of the xy-plane, dependent on $\mu$ of course, may be employed to
take such an oval into a fixed position.

Suppose that for $\mu = \bar\mu$ the periodic orbit under observation passes through
a point of an oval of this sort.

The coordinates x, y of an orbit which at $t = 0$ gives a point on this oval
admit an expansion of the form

$x = x_0 + x_2 t^2 + x_4 t^4 + \cdots, \quad y = y_0 + y_2 t^2 + y_4 t^4 + \cdots$

in even powers of $t$. The fact that only even powers of $t$ appear is an immediate
consequence of the invariance of the equations of motion with a reversal
of the order of time and of a similar invariance of the initial conditions,

$x = x_0, \quad x' = 0, \quad y = y_0, \quad y' = 0 \quad (t = 0).$

Moreover a substitution of these series in the differential equations yields
$x_2 = \gamma_x$, $y_2 = \gamma_y$. Now $\gamma_x$ and $\gamma_y$ are not both zero since the ovals of zero
velocity are without double points. Hence the orbit is normal to the oval
of zero velocity and approaches and recedes from the oval along one and
the same analytic curve.

The family $\Gamma$ of orthogonal orbits (see figure) will therefore form a field in
the vicinity of the oval, one and only one of these orbits passing through
each nearby point.*

Consider now any orbit which does not coincide with one of these orthogonal
orbits, but which, nevertheless, approaches near to the oval of zero
velocity. Since the orbit is nowhere tangent to the curves of the field, it cuts
them all in one and the same sense, and does not intersect itself. Moreover,
since the equations of motion (1') in the case $\lambda = 0$ may be regarded as the
equations of motion of a particle $(x, y)$ subject to a force with x and y components
$\gamma_x$ and $\gamma_y$ respectively, i.e., directed toward the inner normal of
the oval $\gamma = 0$, the particle will approach and then recede from the oval,
but will not remain in its immediate vicinity. Hence the orbit forms a species
of open “loop” (see curve APB of figure) facing the inner normal of the
oval.

Unless a second branch of the orbit happens to approach the same point
of the oval for $\mu = \bar\mu$, these facts show that no points of intersection are
introduced near the oval of zero velocity.

In the excluded case, namely that in which the orbit consists of a segment 
described twice in opposite senses, no such points of intersection can appear 
near the oval, so that the italicized statement holds here too. 

The possibility that there are two branches of the orbit which approach 
one and the same point of the oval of zero velocity is to be looked upon as a 
combination of the two cases already disposed of. 

Let us consider this case briefly. Construct a family $\Sigma$ of curves cutting
the orbits $\Gamma$ orthogonally; the oval of zero velocity $\Sigma_0$ is one such curve.
Suppose that the first of two nearly coincident loops cuts a curve $\Sigma_1$ of this
family in A and B while the second cuts in A' and B'. We may assume that
AB and A'B' are similarly directed segments of this curve (see Fig. 8).

When $\mu$ is sufficiently near to $\bar\mu$ ($\mu < \bar\mu$) the four points A, B, A', B' will
continue to have one and the same relative order provided the auxiliary
curve $\Sigma_1$ is properly chosen. Here there will be six functions $v_1(t)$ to consider
since there are four branches approaching coincidence, and it will be
necessary to avoid the zeros of these functions on the periodic orbit for $\mu = \bar\mu$.
Since the lengths AA' and BB' deal with the displacements of two corresponding
branches of the orbits AB and A'B' these lengths will be infinitesimals
of the first order in $\mu - \bar\mu$. Since AB' and BA' deal with the displacements
of corresponding branches of the orbits AB and B'A', these lengths
will also be of the first order. Hence the cross ratio of the lengths
$AA' \cdot BB' / AB' \cdot BA'$ approaches a definite limit, different from zero, as $\mu$
approaches $\bar\mu$. This shows that the pairs of points AB and A'B' either separate
each other for all nearby values of $\mu$, or fail to separate each other for
all such values. If they fail to separate each other and if either segment as
A'B' lies within the other for $\mu < \bar\mu$, then one segment will include the other
for $\mu > \bar\mu$ also. For AA' and B'B are of the same order, and their sum is
less than AB for $\mu < \bar\mu$, and AA' and B'B have the same sign. Hence for
$\mu > \bar\mu$ each of the lengths AA' and B'B has the same sign and is less than
AB. This implies either that AB includes A'B' or that A'B' includes AB.
Likewise if either segment lies within the other for $\mu > \bar\mu$, the same will be
true for $\mu < \bar\mu$.

* See Bolza, Variationsrechnung, pp. 100-102.

Hence, if it can be proved that the orbital arcs AB and A'B' intersect
twice, or once, or not at all according as the segment AB (or A'B') is included
within A'B' (or AB), or as these segments partially overlap on $\Sigma_1$, or as they
are external to each other, it will follow at once that there are the same number
of points of intersection of the two branches for $\mu < \bar\mu$ and for $\mu > \bar\mu$. We
will prove that this relation between the points of intersection of two nearby
loops holds in reversible problems of the type under consideration.

In the first place if AB and A'B' are external to one another, the corresponding
orbital loops lie wholly between the curves $\Gamma$ through A, B and
A', B' respectively so that the loops cannot intersect. This case is thus
disposed of.

The other cases appear to require a further consideration of the orbits near
the oval of zero velocity. Let us call the unique point of such an orbit at
which it is tangent to a curve $\Sigma$ the vertex of the orbit; and let us call the curve $\Gamma$
through that vertex the axis of the curve.

Consider that orbit defined by the equations (1') alone which is tangent to
a curve $\Sigma$ at a point P' with a velocity $v$. Let x, y be the coordinates of the
point of tangency. This tangent orbit will cut $\Sigma_1$ in two points A' and B'.
Let $u_1$ and $u_2$ be the distances of A' and B' respectively along $\Sigma_1$, measured
from the axis $\Gamma$ through P. The coordinates $u_1$ and $u_2$ are evidently analytic
[...] oval of




In order that these tangent orbits may also satisfy (4') it is only necessary
that the velocity $v$ equals $\sqrt{2\gamma}$. Thus for the solutions of (1'), (4') whose
vertex lies at $(x, y)$ we find

$u_1 = f(x, y, \sqrt{2\gamma}), \quad u_2 = f(x, y, -\sqrt{2\gamma}),$

where $f$ is analytic in its three arguments. The same function gives $u_1$ and $u_2$.

We wish to consider the variation of $u_1$ and $u_2$ as the vertex P' moves along
a curve $\Sigma$ from a fixed point P. To this end let us make a conformal transformation
which throws the curve $\Gamma$ on which P lies, into a new x-axis in such
manner as to preserve arc lengths along the curve $\Gamma$ (see figure). The resulting
function $f$ then involves a parameter depending on the curve $\Gamma$ selected.
Inasmuch as the curves $\Sigma$ are orthogonal to this new x-axis the variations of
$u_1$ and $u_2$ are respectively given by $du_1/dy$ and $du_2/dy$. Let us compute these
quantities.

Suppose first that the vertex varies along the oval of zero velocity, in which
case we have $v = 0$. Here we find at once

$du_1/dy = du_2/dy = f_y(x, y, 0).$

It is clear that $f_y$ is positive along the oval since the curves $\Gamma$ into which the
orbits degenerate form a field.

Next suppose that the vertex varies along some other curve $\Sigma$ within the
oval. Here we find

$\frac{du_1}{dy} = f_y(x, y, \sqrt{2\gamma}) + f_v(x, y, \sqrt{2\gamma}) \, \frac{\gamma_y}{\sqrt{2\gamma}},$

with a like formula for $du_2/dy$. Now along the x-axis, which represents an
orbit, we have $\gamma_y = 0$ by the second equation (1') in this reversible case.
Thus, in spite of the fact that $\gamma$ is a small quantity, we have the same expressions
for $du_1/dy$, $du_2/dy$ as before.

It follows that, as a vertex P' moves along any curve $\Sigma$, the points A'
and B' move along $\Sigma_1$ in the same sense with a relative velocity which is a
continuous function of the position of the vertex.

Consider orbits AB and A'B', and assume that the vertices of both orbits
are very near to the oval of zero velocity in comparison with the distance of
the curve $\Sigma_1$ from that oval $\Sigma_0$. We will further assume that the vertex P
of AB lies on a curve $\Sigma$ which is nearer the oval than the curve $\Sigma$ through
the vertex P' of A'B'. This is clearly legitimate unless the two vertices lie
on the same curve $\Sigma$, a possibility which will be considered later.

If the vertex of A'B' is moved far enough along the curve $\Sigma$ on which it
lies, A'B' will evidently lie outside of AB and the two loop orbits AB and
A'B' will not intersect at all. From such a position let the vertex move back
along the same curve $\Sigma$. According to what has been proved, the points
A', B' will move in the same sense, and the orbit A'B' will commence to intersect
the orbit AB as soon as B' has passed A. This intersection will be on
the A side of the vertex of AB and cannot leave that side until A' has also
passed A; it should be observed that this point of intersection cannot pass
the vertex of AB precisely because this vertex lies on a curve $\Sigma$ nearer the
oval than any part of the orbit A'B'. Likewise when B' passes B there is a
single point of intersection introduced on the B side of the vertex of AB,
which cannot disappear until A' has also passed B.

It is not conceivable that after A' passes A there are still points of 
intersection on the A side of the vertex of AB, for such points could not 
disappear thereafter and yet are not present when A'B' has moved to the 
other side of AB. Likewise there are no points of intersection on the B side
of AB after A' has passed B. 

Thus there are two possibilities for the orbits AB and A'B' when the vertex 
of A'B' lies in its initial position and the segments AB and A'B' have a part 
in common: either A' or B' lies without AB, in which case there is just one 
intersection; or A'B' includes AB, in which case there are two points of inter¬ 
section. This is in agreement with our statement. 

If the vertices of the orbits AB and A'B' lie on the same curve $\Sigma$, a slight
displacement of the orbit A'B' will move its vertex to lie on a different curve
$\Sigma$ without altering the relative position of the segments AB and A'B' on $\Sigma_1$,
and without altering the number of intersections of the two orbits. Hence
our statement is true in this case also.

Thus the number of intersections of the given analytically varying periodic 
orbit with itself is unchanged even when there are two or more branches of 
the orbit which pass simultaneously through a point of the oval of zero velocity. 
Thus our italicized statement is proved. 

We are now prepared to prove the following fact: 

If the characteristic surface in a reversible problem is closed or bounded by a 
finite number of ovals of zero velocity without multiple points, and if, further, 
every orbit is cut by nearby orbits in its immediate vicinity at least once in any 
interval of time 0, then the period of a periodic orbit can not become infinite with 
variation of a parameter $\mu$.

Let us suppose that the statement is not true and that the period of some
periodic orbit does become infinite as $\mu$ approaches a value $\bar\mu$.

At the same time the length of the orbit must become infinite. For to a
short interval t of time corresponds a minimum positive length of orbit.
Otherwise by a limiting process we arrive at a point orbit, so that we have
$\gamma_x = \gamma_y = \gamma = 0$ at a point, contrary to the hypothesis that there are no
multiple points on an oval of zero velocity.




It is possible to go further and assert that the arc length of the part of the
orbit outside of a small enough neighborhood of the ovals of zero velocity 
becomes infinite. Here we recall the loop form of orbits in the neighborhood 
of such ovals. This form shows that, if a neighborhood of these ovals be taken 
small enough, more of any part of an orbit corresponding to a fixed small 
interval of time will lie outside of that neighborhood than within it. 

Hence we may select a point of the characteristic surface, lying outside of a 
fixed neighborhood of the ovals, near which there is a large arc length of the 
orbit. But the orbits are approximately rectilinear. Hence it is apparent
that there is a direction at the point which is approximately that of arbitrarily 
many branches of the periodic orbit near the point. All of the nearby orbits 
must have approximately this direction since the number of intersections is 
fixed. However, by the assumed property of neighboring orbits each pair of 
these approximately parallel branches will intersect at least once in every 
interval 0, so that there will be a large number of intersections even in this 
case. Thus we are led to a contradiction, and infer that the period of no 
periodic orbit can become infinite with variation of $\mu$.

As a simple application we may consider the variation of a closed geodesic 
without double points on a convex surface. In the case of an ellipsoid there 
exist three such closed geodesics given by its intersections with the three 
principal planes. Moreover nearby geodesics intersect within a fixed interval 
0 on a convex surface. 

Consequently, if the ellipsoid be varied analytically into a second convex 
surface through a series of convex surfaces, there will always exist at least one 
closed geodesic without double points on the resulting convex surface. Admitting
then that it is possible to pass from any one convex surface to any
other in this way, it appears that there exists at least one closed geodesic without
double points on any convex surface, for such orbits arise or disappear
in pairs. 

This case was precisely the case treated by Poincaré (loc. cit.), who also
employed the method of analytic continuation. He did not explicitly mention
the possibility that the length of the varying geodesic becomes infinite.
It is precisely this possibility which has engaged our attention. 

It is interesting to observe that only the possibility that a period becomes 
infinite keeps us from inferring that there is a closed geodesic without multiple 
points on every surface. 

The minimax method has enabled us to infer that there exists a closed 
geodesic with $\sigma > \pi$ on every surface of genus 0. But it has not been established
that such an orbit exists without multiple points.

21. Method of analytic continuation. Irreversible case. It has appeared 
earlier in the paper that the irreversible case presents much greater difficulties 



than the reversible case. To legitimize the use of the method of analytic 
continuation for unrestricted variation of the parameter, I have been forced 
to make still more stringent hypotheses. Our first result will be the following: 

In an irreversible problem with characteristic surface of genus 0, throughout
which we have

$\lambda > 0, \quad \gamma > 0, \quad (\log \gamma)_{xx} + (\log \gamma)_{yy} > 0,$

the period of a periodic orbit without double points can not become infinite with
variation of a parameter $\mu$.

Before passing to the demonstration we note that the characteristic surface
may be conceived of as a convex surface, for when isothermal coordinates
x, y are employed, the condition for positive curvature is precisely the last
condition on $\gamma$ imposed above.* We shall regard the characteristic surface
as a convex surface. The restriction $\gamma > 0$ shows that no ovals of zero
velocity are present.

A vital difference between the reversible case and the irreversible case is 
that in the latter case the number of intersections of the orbit with itself may 
vary. Let us consider briefly this possibility.

If the varying periodic orbit touches itself with two coincident directions
for any value of $\mu$ we infer at once just as we did in the preceding paragraph
that the number of intersections of the orbit with itself will not change as $\mu$
varies through this particular value.

If the orbit touches itself with opposite directions at the point of tangency
the two branches have only first order contact at the point on account of the
fact that $\lambda$ is positive. In fact it has been observed earlier (§ 11) that in this
case the curvatures of the two branches differ and that each branch appears
on the right of the other.

As $\mu$ passes through a value for which there is such a contact, the number
of intersections of the orbit with itself may increase or diminish by two.
Consequently the number of intersections may vary as $\mu$ changes.

Let us now give a proof of the italicized statement, and let us at first restrict
attention to the case in which for the initial value $\mu_0$ of $\mu$ the periodic orbit T
is without double points. In this case we will call the simply connected
region C which lies to the left of T the interior of T.

If $g$ stands for the geodesic curvature of T at any point of the characteristic
surface, and $1/\rho$ stands for the total curvature at any point of C, we have the
well known formula

(25) $2\pi = \int g \, ds + \iint \frac{d\omega}{\rho},$

where $ds$ and $d\omega$ are the element of arc along T and the element of area of C
respectively.

* Darboux, Leçons sur la théorie générale des surfaces, vol. 2, second edition (Paris, 1915), pp. 385-387.

Now suppose $\mu$ to vary. At the same time T and the interior continuum
vary, and as long as the curve T does not touch itself, the above formula
holds without any modification of meaning.

From this fact it can be inferred at once that if T does not touch itself
the period of T will not become infinite. We will base this conclusion on the
obvious inequality

(26) $\int g \, ds < 2\pi.$

We recall that in the associated reversible problem ($\lambda = 0$) for which $\gamma$
is the same, the orbits may be interpreted as geodesics.

In the xy-plane, however, the curvature of the orbits in the given irreversible
problem will exceed the curvature of the tangent orbit for the reversible
problem by precisely $\lambda / \sqrt{2\gamma}$ (see (20)). Hence we will have uniformly
$g > d > 0$ throughout the characteristic surface. Thus if L is the
length of the periodic orbit, (26) gives us $L < 2\pi / d$. Since L is limited the
period is also limited.
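
The way the curvature formula bounds the length can be checked on a case where every term is known in closed form. The following is only a sketch under stated assumptions: it uses a circle of latitude on the unit sphere, which is not an orbit of the dynamical problem treated here, and the function name `latitude_circle_check` and the colatitude 0.6 are hypothetical choices. On the unit sphere the total curvature is 1, the circle of colatitude theta has geodesic curvature cot(theta) and length 2 pi sin(theta), and the enclosed cap has area 2 pi (1 - cos(theta)).

```python
import math

def latitude_circle_check(theta):
    """Terms of the curvature formula for the cap of colatitude theta on the unit sphere."""
    g = math.cos(theta) / math.sin(theta)                 # geodesic curvature of the circle
    length = 2.0 * math.pi * math.sin(theta)              # length L of the circle
    boundary_term = g * length                            # integral of g ds (g is constant)
    area_term = 2.0 * math.pi * (1.0 - math.cos(theta))   # curvature 1 times the cap area
    return g, length, boundary_term, area_term

g, L, boundary_term, area_term = latitude_circle_check(0.6)
total = boundary_term + area_term   # should reproduce 2*pi
bound = 2.0 * math.pi / g           # length bound obtained from the boundary integral
print(total, 2.0 * math.pi, L, bound)
```

Here the area term is positive, so the boundary integral falls short of 2 pi, and the length stays below 2 pi divided by the curvature. Any lower bound d smaller than g only weakens the estimate, which is the form used in the text.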

We need then only consider the case in which as $\mu$ varies the orbit T
touches itself. However, on account of the fact that two oppositely tangent
orbits lie to the right of each other, T will necessarily touch itself on the outer
side first.

Conceive of C as beginning to overlap itself as $\mu$ increases beyond such a
point. We may represent C as a membrane so that in the overlapping portion
we have two layers of the membrane. With this convention, formula
(25) will continue to hold no matter how many outer contacts are introduced,
inasmuch as contacts of T will continue to appear as outer contacts of the
membrane with itself. Consequently the integral $\iint d\omega / \rho$ taken over the
membrane remains positive.

Our earlier argument may now be applied to show that the period of T
cannot become infinite with variation of $\mu$.

The above result admits of the following simple generalization: 

If the same restrictions on the dynamical problem are imposed as before and if
a periodic orbit may be regarded as the complete positively taken boundary of a
simply connected piece of a Riemann surface lying in the characteristic surface,
the period cannot become infinite with variation of a parameter $\mu$.

As long as such a Riemann surface exists the formula (25) holds, provided
we regard the interior of the piece bounded by the orbit as the region over
which the area integration is to be performed. This area integral is obviously
positive so that we obtain the inequality (26) as before, and infer that
the period is less than a definite positive quantity.




Thus we need merely to show that continuous variation of a piece of a
Riemann surface of this type continues to be possible with the variation of $\mu$.

As in the earlier case, no difficulty in making such a variation arises from
the possibility of an inner contact of the boundary with itself, although outer
contacts may necessitate the introduction of new overlapping regions. The
Riemann surface may therefore be left unmodified until the boundary begins
to approach a branchpoint in the same sheet of the surface. But nearby
parts of the boundary cannot nearly surround the branchpoint since then
there would be an inner point of contact. Hence we may modify the branchpoint
to lie away from the boundary, say along the inner normal.

Since the period of the orbit remains finite as long as such variation is
possible, we conclude that the variation of the Riemann surface may be continued
indefinitely by an appropriate modification of the internal branchpoints.
This establishes our statement.

A figure-of-eight orbit constitutes the next simplest type of periodic orbit
after those without any double point. Such a figure-of-eight orbit may
always be thought of as forming the complete boundary of a simply connected
part of a Riemann surface of two sheets with two branchpoints taken
suitably. In fact we may deform the orbit into two nearly coincident curves
on the characteristic surface encircling the included region twice in a positive
sense. The single branchpoint of the piece of the Riemann surface lies in
this included region.

However, orbits with two double points may not have this property, and I 
believe that in such cases the period may actually become infinite. 

We state one more result of the same sort which applies for characteristic 
surfaces of any genus: 

If $\lambda$ is sufficiently large and positive in an irreversible problem with a closed
characteristic surface of any genus, and if $\gamma$ is positive, the period of a periodic
orbit without double points cannot become infinite with variation of a parameter $\mu$.

By the curvature formula (20) it follows that such an orbit will necessarily
have large positive curvature throughout. Considerations of analysis situs
render it apparent therefore that the orbit forms a small convex oval on the
characteristic surface. In fact if the orbit joined two points at some considerable
distance apart on that surface it appears that the orbit would intersect
itself; this is evident if the region in question is mapped upon a plane.
Hence the total orbit lies in a small part of the characteristic surface and
forms a convex oval as stated.

As $\mu$ varies such an oval cannot change its form since the curvature remains
large and positive. Thus the period will remain finite.

All of the above results will undoubtedly admit of great extension. In
particular it may be noted that the introduction of concave boundaries will
be possible under certain conditions inasmuch as a varying periodic orbit
cannot become internally tangent to such a boundary.

Part III. Reduction of the dynamical problem to a surface transformation

22. The manifold of states of motion. The equations of motion (1') may
be replaced by the equivalent differential system

(27) $\frac{dx}{dt} = x', \quad \frac{dy}{dt} = y', \quad \frac{dx'}{dt} = -\lambda y' + \gamma_x, \quad \frac{dy'}{dt} = \lambda x' + \gamma_y.$

We now consider the variables $x' = dx/dt$, $y' = dy/dt$ as well as x, y to be
dependent variables. The relation (4') may be written

(28) $\frac{1}{2}(x'^2 + y'^2) - \gamma = 0.$

In conformity with methods long employed in dynamics we will interpret
x, y, x', y' as the rectangular coordinates of a point in four-dimensional
space. For obvious reasons a set of values x, y, x', y' will be called a state
of motion. Thus we have a three-dimensional manifold (28) representing
possible states of motion and lying within this four-dimensional space.

Evidently the equations (27) represent a steady fluid motion of this four-dimensional
space which carries the manifold (28) into itself. The totality
of orbits in the dynamical problems of the type (1'), (4') may therefore be
thought of as represented by the stream lines of a three-dimensional fluid
in steady motion. It has been noted (§ 6) that the fluid motion when represented
in $xy\varphi$-space is incompressible. In the original $xyx'y'$-space the volume
integral $\iiint dx \, dy \, d\varphi$ is invariant when taken over any part of the fluid.
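
That the flow (27) carries the manifold (28) into itself can be observed numerically. The following is only a sketch under stated assumptions: the function `gamma`, the constant `LAM` standing for $\lambda$, the initial state, and the fourth-order Runge-Kutta integrator are all hypothetical choices made for illustration and do not come from the paper. A state is started on the manifold and the left-hand side of (28) is monitored along the stream line.

```python
import math

LAM = 0.5  # hypothetical constant value of lambda

def gamma(x, y):
    # hypothetical sample function gamma, positive near the origin
    return 1.0 - 0.5 * (x * x + y * y)

def rhs(s):
    """Right-hand sides of system (27) for the state s = (x, y, x', y')."""
    x, y, xp, yp = s
    gx, gy = -x, -y  # partial derivatives of the sample gamma
    return (xp, yp, -LAM * yp + gx, LAM * xp + gy)

def rk4_step(s, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    k1 = rhs(s)
    k2 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k1)))
    k3 = rhs(tuple(a + 0.5 * h * b for a, b in zip(s, k2)))
    k4 = rhs(tuple(a + h * b for a, b in zip(s, k3)))
    return tuple(a + h * (p + 2 * q + 2 * r + w) / 6.0
                 for a, p, q, r, w in zip(s, k1, k2, k3, k4))

def residual(s):
    """Left-hand side of relation (28); zero on the manifold of states of motion."""
    x, y, xp, yp = s
    return 0.5 * (xp * xp + yp * yp) - gamma(x, y)

# Start on the manifold: speed sqrt(2*gamma), direction angle phi0.
x0, y0, phi0 = 0.3, 0.0, 0.7
speed = math.sqrt(2.0 * gamma(x0, y0))
state = (x0, y0, speed * math.cos(phi0), speed * math.sin(phi0))

worst = abs(residual(state))
for _ in range(2000):
    state = rk4_step(state, 0.005)
    worst = max(worst, abs(residual(state)))
print(worst)
```

The printed value is the largest deviation from (28) met along the computed stream line; it stays at the level of the integration error, since the terms in $\lambda$ cancel when the left-hand side of (28) is differentiated along (27).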

It is to be observed that the manifold (28) is an everywhere analytic manifold,
at least if we agree to bar out the possibility that there exist double
points on an oval of zero velocity. In fact only in this case can the four
partial derivatives of the left-hand side of (28) vanish at a point of the manifold.

The variables x, y, $\varphi$ cannot be used along the ovals of zero velocity, since
the angular variable $\varphi$, which indicates direction of motion, becomes indeterminate
there.

The connectivity of the manifold of states of motion is completely determined
by the genus of the characteristic surface and the number of ovals of
zero velocity. We shall not elaborate this relation.

23. Surfaces of section. A periodic orbit is represented by a closed stream
line in the manifold of states of motion. If an analytic surface in this manifold
is bounded by this stream line in such a way that nearby stream lines
cut it throughout in one and the same sense, at an angle of the first order in
the distance from the stream line, the surface will be said to be regularly bounded
by the closed stream line.




A surface of section will be defined to be an analytic surface (or a surface
made up of analytic pieces) regularly bounded by a finite number of closed
stream lines, cut throughout in the same sense by the stream lines and at
least once by every stream line in a fixed interval 0 of time.

The notion of a surface of section for certain types of dynamical problems
with two degrees of freedom is due to Poincaré (loc. cit.). In the case which
he considered, the dynamical problem differed slightly from an integrable
case, and the surface of section was a ring. In what follows we shall show
that, if the notion be extended as above to surfaces of any genus and any
number of boundaries, the surface of section is a very general phenomenon.

We propose now to illustrate the existence of such surfaces by two simple 
dynamical problems: 

Example I. A particle P moves in a fixed plane subject to a conservative 
field of force which has everywhere a positive component towards a fixed straight 
line. 


Let the fixed straight line be chosen as the x-axis, and a perpendicular line 
as the y-axis. In this reversible problem ($\lambda = 0$) we have $\gamma_y$ of opposite 
sign to y and vanishing with y. It will be assumed that the additive constant 
in the potential function $\gamma$ has been chosen so that the particle is confined 
to lie within an oval of zero velocity containing a single segment AB of the 
x-axis. It will also be assumed that $\gamma_{yy}$ is not zero along the axis. 

Under these restrictions we shall show that the surface y' = 0, y > 0 in 
the manifold (28) is a surface of section. 


Firstly, the surface is analytic in that manifold. The equations of this 
surface may be written $x' = \pm\sqrt{2\gamma}$, $y' = 0$ with parameters x, y, except when 
we have $\gamma = 0$. But when we have $\gamma = 0$, not both $\gamma_x$ and $\gamma_y$ are zero 
(since double points on an oval of zero velocity were excluded). Hence we 
may take either x, y' or x', y as parameters in this case. 

Along the x-axis the normal component of force $\gamma_y$ vanishes. Hence this 
segment is the trace of a periodic orbit formed by the backward and forward 
motion of a particle along AB. Thus the boundary line $y' = y = 0$ of the 
surface forms a closed stream line in the manifold of states of motion. * 

In order to show that the surface is regularly bounded by this stream line 
it must be established that the stream lines cut the surface y' = 0 in one 
sense and at an angle which is of the same order as the distance of the point 
of intersection from the closed boundary stream line. 

Excepting at points of the manifold of states of motion which correspond 
to a state of motion with velocity zero, proper coordinates for that manifold 
are x, y, $\phi = \arctan(y'/x')$. If these are regarded as the rectangular 
coordinates of a point in ordinary space, the closed stream line $y = y' = 0$ 
will be represented by the straight lines $y = 0$, $\phi = 0$ or $\pi$. The surface 

Trans. Am. Math. Soc. 18 




$y' = 0$, $y > 0$ appears as one of the half planes $\phi = 0$ or $\pi$, $y > 0$. The 
angle which the stream line through a point of one of these half planes makes 
with that plane will be of the same order as the distance from the boundary 
line if and only if $\phi'$ is of the same order as y. But $\phi'$ reduces to 
$\pm\gamma_y/\sqrt{2\gamma}$ if $\phi = 0, \pi$ by (19). Since $\gamma_{yy}$ is not zero along the x-axis, 
the angle is of the same order as y. At this point we observe also that all the 
stream lines must cut the given surface $y' = 0$, $y > 0$ in a definite sense save 
possibly along the points which correspond to a position of the particle on the 
oval of zero velocity; for $\phi'$ is of one sign throughout. 
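The reduction of $\phi'$ quoted here can be checked directly. The following is only a sketch, and it assumes that the equations of motion of the reversible problem are $x'' = \gamma_x$, $y'' = \gamma_y$ with the energy relation $x'^2 + y'^2 = 2\gamma$; this is the form suggested by the relation $y'' = \gamma_y$ used further on in this example, since formula (19) itself is not reproduced in this part of the text.

```latex
% Direction angle and its time derivative
\phi = \arctan\frac{y'}{x'}
\quad\Longrightarrow\quad
\phi' = \frac{x'\,y'' - y'\,x''}{x'^2 + y'^2}.

% On the surface y' = 0 the assumed energy relation gives x' = \pm\sqrt{2\gamma}, so
\phi'\Big|_{y'=0} = \frac{y''}{x'} = \pm\frac{\gamma_y}{\sqrt{2\gamma}}.
```

Under these assumptions $\phi'$ vanishes with $\gamma_y$, and hence with y, which is the order-of-magnitude statement used in the text.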

To complete our proof that the given surface is regularly bounded by its 
boundary stream line we need to consider the two points on the boundary 
which correspond to a position of the particle at A or B. Now at these points 
$\gamma_x$ is different from zero although $\gamma_y$ vanishes. We may take y, x', y' as 
parameters and write the equations of the manifold (28) in the form 

$$x = F(y,\ x'^2 + y'^2)$$

where F is analytic in its two arguments. If y, x', y' be thought of as 
rectangular coordinates, the surface $y' = 0$ appears as a coordinate plane. The 
line $y = y' = 0$ appears as the x'-axis in that plane. The distance from a 
point of that plane to the line is y. The angle which a stream line through a 
point of the plane makes with the plane will clearly be of the same order as 
the distance y if dy'/dt or y'' is of the same order as y. But the equations of 
motion give $y'' = \gamma_y$, a quantity of the order of y. 

Incidentally the above argument shows that the angle between the surface 
y' = 0 and a stream line through any point of it corresponding to a position 
of the particle on the oval of zero velocity is not zero as long as $\gamma_x$ is not zero. 
But at such a point we may use x, x', y' as parameters and proceed as before. 

Our conclusion is that the surface $y' = 0$, $y > 0$ is regularly bounded by 
the closed stream line $y = y' = 0$ and is cut in one and the same sense by the 
stream lines throughout its extent. 

We observe finally that, since there is always a component of force towards 
the x-axis of the order of y, every orbit will cut that axis in every interval $\theta/2$ 
of time ($\theta$ being taken sufficiently large). 

Every such orbit will have a direction parallel to that axis once and only 
once between two such points of crossing. It follows that the surface $y' = 0$, 
$y > 0$ will be cut by every stream line, at least once in a fixed interval $\theta$ of 
time. 

Hence all of the requirements for a surface of section are satisfied by the 
surface y' = 0, y > 0. 

To each point within the part y > 0 of the oval of zero velocity there 
correspond two points of the surface of section. At one of these x' is positive 
and at the other negative. For points of the oval these two corresponding 
points of the surface of section merge. Our complete conclusion is therefore 
the following: 

A surface of section in Example I is $y' = 0$, $y \ge 0$. This surface is simply 
connected and has one boundary stream line. 

If the attracting force were due to a number of particles, situated on the 
x-axis and attracting according to the Newtonian Law, the surface y' = 0, 
y > 0 would still represent a surface of section, of genus zero. This surface 
would have, however, one more boundary than there were particles. 

A thoroughgoing discussion of this case involves the use of a regularizing 
transformation of the variables x, y, and is not made here. 

The following example is designed to show that surfaces of section are 
present in irreversible problems also, and need not be of genus zero. 

Example II. An electrified particle moves in the xy-plane subject to a doubly 
periodic field of normal magnetic force of constant sign.* 

In this problem we have $\lambda$ constant and $\gamma$ doubly periodic and of one sign. 
We will consider two states of motion to be the same, which correspond to 
the particle at congruent points of the network of periods and with the same 
direction of motion. 

The intrinsic equation for the curvature of the orbits as given by (20) 
becomes $K = \lambda/\sqrt{2\gamma}$. 

Since $\gamma$ is nowhere zero a suitable set of parameters in (28) is given by x, 
y, $\phi = \arctan(y'/x')$. The surface $y' = 0$, $x' > 0$ in the manifold of states 
of motion becomes $\phi = 0$ in these parameters, and, under our hypothesis 
as to congruent points, is clearly a closed analytic surface of genus 1. 

Every stream line cuts this surface in one and the same sense, for we have 
$\phi' > 0$. Also since the curvature exceeds a definite positive constant in 
absolute value, every orbit will pass through a state of motion $\phi = 0$ within 
a fixed interval of time $\theta$. 

A surface of section in Example II is $y' = 0$, $x' > 0$. This surface is 
doubly connected and without boundaries. 

It is interesting to observe that in both of the examples given above, the 
treatment of the surface of section is based on a differential inequality. This 
is most obvious in the second case where the inequality is $\phi' > 0$. 

This phenomenon is an entirely general one. 

24. Lemma on regular boundaries. The difficulties in proving the existence 
of surfaces of section may be considerably diminished by the use of two 
preliminary lemmas given in the present and immediately following paragraph. 

Lemma I. If a strip S, made up of a finite number of analytic pieces, is 
bounded on one side by a closed stream line, and if S is cut in the same sense 
throughout by the nearby stream lines, once at least in every interval of time $\theta$, 
then there exists a similar strip S' with the same boundaries as S and regularly 
bounded by the given closed stream line. 

* It was observed in § 7 that an electrical problem of this description leads 
to an irreversible dynamical problem of the type treated in the present paper. 

Suppose that the given closed stream line and its neighborhood is deformed 
analytically into a space with rectangular coordinates u, v, w in such wise 
that this stream line goes into the w-axis. We assume that w is an angular 
variable of period $2\pi$, and that the part of the w-axis between w = 0 and 
w = $2\pi$ corresponds to the stream line taken once. 

The part of S (as represented in this auxiliary space) near the w-axis will 
either not wind about this axis as w increases by $2\pi$ or it will do so a certain 
number k of times. In the latter case the further change of variables 

u = u' cos kw — v' sin kw, v = u' sin kw + v' cos kw, w = w' 

will lead to a similar u'v'w' space in which S will not wind about the w-axis. 
Thus we are at liberty to assume that S does not wind about the w-axis as w 
increases by $2\pi$. 
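The effect of this change of variables can be illustrated numerically. The sketch below is not part of the original argument: it takes a hypothetical model strip whose trace in each plane w = const. points at the angle kw, and checks that its winding about the axis drops from k to zero in the primed coordinates. The function names and the model strip are assumptions made for illustration only.

```python
import math

def to_primed(u, v, w, k):
    """Solve u = u' cos kw - v' sin kw, v = u' sin kw + v' cos kw for (u', v')."""
    c, s = math.cos(k * w), math.sin(k * w)
    return (c * u + s * v, -s * u + c * v)

def winding_number(points):
    """Net number of turns about the origin made by a closed chain of plane points."""
    total = 0.0
    for (u0, v0), (u1, v1) in zip(points, points[1:]):
        # Signed angle from one point to the next.
        total += math.atan2(u0 * v1 - v0 * u1, u0 * u1 + v0 * v1)
    return round(total / (2 * math.pi))

def strip_trace(k, radius=0.1, steps=400):
    """Hypothetical model strip: its trace in the plane w = const. points at angle k*w."""
    ws = [2 * math.pi * i / steps for i in range(steps + 1)]
    return ws, [(radius * math.cos(k * w), radius * math.sin(k * w)) for w in ws]

ws, trace = strip_trace(k=3)
unwound = [to_primed(u, v, w, 3) for (u, v), w in zip(trace, ws)]
```

Here `winding_number(trace)` counts the turns of the model strip in the original coordinates and `winding_number(unwound)` counts them after the substitution.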

Our hypothesis concerning S necessitates now that every nearby stream line 
winds around the w-axis at least once when t increases by a sufficient amount. 
Let us assume this winding is in a positive sense. Consider the plane w = 0 
and any parallel plane w = d. A stream line from a point P of the first 
plane intersects the second plane in a unique point Q, at least if we consider 
stream lines near the w-axis only. Thus we define a one-to-one analytic 
transformation from one plane to the other. In particular the trace of the 
w-axis in the second plane is derived from the trace of that axis in the first 
plane. The directions through this trace in the one plane are transformed 
projectively into the corresponding directions in the other plane. 

When d increases from 0 to $2\pi$ each one of these directions has been rotated 
through a perfectly definite angle. I assert that the total rotation of every 
direction will exceed a definite positive quantity in numerical magnitude. 
To establish this fact we observe that if the projective transformation of 
directions leaves two directions invariant, the total rotation must include a 
positive rotation through a multiple of $2\pi$; otherwise every stream line near 
the w-axis would not wind about that axis in a fixed interval of time. The 
same thing is true if there is one invariant direction or if every direction is 
invariant. On the other hand if there is no invariant direction the 
transformation of directions is projectively equivalent to a rotation. In any case 
then the angle of rotation will exceed a definite positive quantity. It should 
be borne in mind that the projective transformation is direct. 

Now imagine a line through the w-axis in the plane w = d to rotate about 
that axis at a lesser rate (with respect to change of w) than the instantaneously 
coincident direction in the moving plane w = d. This moving line generates 
a ruled surface which evidently cuts nearby stream lines in one and the same 
sense. Assume the difference in rates is constant. When this constant 
difference is small the new moving line almost coincides with one of the former 
lines, and its total rotation as d increases from 0 to $2\pi$ is positive. But it is 
evident that, as this constant increases, there must come an instant when 
the total rotation is zero. At this instant the new moving line will generate 
an analytic surface which represents a closed strip $S_1$ in the space of the 
manifolds of states of motion regularly bounded by the closed stream line which 
is the boundary of S. Furthermore this strip $S_1$ will wind around the stream 
line precisely as often as S. 

Our next step will be so to deform S that it will coincide with $S_1$ near the 
boundary closed stream line and will not be modified near its other boundary. 

Consider any point of $S_1$, say P, and prolong the stream line which passes 
through P until it meets S in Q. The representation in the uvw-space 
together with the fact that the two surfaces are cut in one and the same sense 
by nearby stream lines shows that the point P will vary continuously with Q 
(save possibly along the w-axis). Thus there will be set up a one-to-one 
continuous correspondence between the points P of $S_1$ and the points Q of a 
part of S. 

Now the distance from P to Q along the stream line PQ will vary 
continuously with the position of P, and indeed analytically, unless Q happens to 
lie on one of the edges of S. 

Let $\phi(w)$ be a function of w defined as follows: 

$$\phi(w) = 0 \ \text{if}\ w < \rho, \qquad \phi(w) = \frac{w - \rho}{\delta - \rho} \ \text{if}\ \rho \le w \le \delta, \qquad \phi(w) = 1 \ \text{if}\ w > \delta > 0.$$

Modify each point Q of S back toward P in such wise as to diminish the 
distance from P to Q along the stream line in the ratio $\phi(w)$ to 1, where w 
stands for the distance from P to the nearest point of the closed stream 
line. 
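The piecewise-linear ratio just defined, and the way it shortens the distance from P to Q, can be sketched as follows. The names `cutoff`, `rho`, `delta` and `deformed_distance` are illustrative assumptions; in particular `rho` and `delta` stand for the two thresholds of the definition, whose symbols are only partly legible in the source.

```python
def cutoff(w, rho, delta):
    """Piecewise-linear ratio: 0 for w < rho, 1 for w > delta, linear in between."""
    if not 0 <= rho < delta:
        raise ValueError("expected 0 <= rho < delta")
    if w < rho:
        return 0.0
    if w > delta:
        return 1.0
    return (w - rho) / (delta - rho)

def deformed_distance(distance_pq, w, rho, delta):
    """Distance along the stream line from P to the moved point, after
    shrinking the original distance from P to Q in the ratio cutoff(w) to 1."""
    return cutoff(w, rho, delta) * distance_pq
```

Near the closed stream line (w below `rho`) the moved point falls on P itself, so the modified strip coincides with the strip through P; far from it (w above `delta`) the point Q is left where it was.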

Thus a new strip S' is obtained which will have the desired properties. 
First, it is regularly bounded by the closed stream line since it coincides with $S_1$ 
in its neighborhood ($\phi = 0$). Secondly, S' will coincide with S near its other 
boundary ($\phi = 1$). Thirdly, inasmuch as S' is obtained by three analytic 
deformations of parts of S the surface S' is made up of a finite number of 
analytic pieces. Lastly, the strip S' is cut in the same sense throughout 
by the stream lines at least once in a fixed interval $\theta$ of time since the 
deformation from S to S' merely moved each point P along the stream line on 
which it lies by a certain finite distance. 




25. Lemma on surfaces of section. Our second lemma is the following: 

Lemma II. Let a surface $\Sigma$, without multiple points and cut by every stream 
line, be made up of a finite number of analytic pieces regularly bounded by a 
finite number of closed stream lines. If the points of $\Sigma$ may be imbedded within 
a set of arcs AB of stream lines forming a three-dimensional continuum in such 
fashion that each arc AB cuts $\Sigma$ precisely once more positively than negatively, 
there will exist a surface of section $\Sigma'$ with the same bounding stream lines as $\Sigma$. 

In fact, let an arc AB cut $\Sigma$ in the successive points $P_1, P_2, \ldots, P_n$ where 
n is an odd integer. Let the corresponding times be denoted by $t_1, t_2, \ldots, t_n$ 
respectively. The times may be reckoned from an arbitrary point Q of AB. 
Let P denote the point of AB with time coordinate 

$$t = t_n - t_{n-1} + t_{n-2} - \cdots + t_1.$$

The point P will necessarily fall within AB since t evidently lies between 
$t_n$ and $t_1$, and does not depend on the choice of Q. 
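The alternating sum of crossing times can be illustrated with a few numbers. This is a minimal sketch under the assumption that the crossing times along one arc are given as a strictly increasing list of odd length; the function name `section_time` is hypothetical.

```python
def section_time(times):
    """Alternating sum t_n - t_(n-1) + ... + t_1 of an odd number of
    strictly increasing crossing times."""
    if len(times) % 2 == 0:
        raise ValueError("an odd number of crossings is required")
    if any(b <= a for a, b in zip(times, times[1:])):
        raise ValueError("crossing times must be strictly increasing")
    # Walk from t_n down to t_1, alternating signs and starting with plus.
    return sum(t if i % 2 == 0 else -t for i, t in enumerate(reversed(times)))
```

For crossing times 1, 2, 4 the sum is 4 - 2 + 1 = 3, which lies between the first and last crossing. Shifting the origin of time by a constant shifts the sum by the same constant, so the point singled out on the arc is unchanged; and when the last two crossings approach coincidence their terms cancel, leaving the value belonging to the single remaining crossing.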

It is clear that P varies continuously with AB unless some of the points $P_i$ 
approach coincidence and disappear, or new points arise. 

But these points will disappear in pairs, or arise in pairs, and at the same 
instant the corresponding set of terms of t will become equal in numerical 
value and will cancel each other in pairs. Consequently the variation of P 
with AB is continuous throughout. 

Near the boundaries of $\Sigma$ all of the stream lines cut in one and the same 
sense. Hence there is only a single point $P_1$ of $\Sigma$ on AB when AB lies near a 
bounding closed stream line, and P will coincide with $P_1$. 

The point P will vary analytically with AB unless two points $P_i$ coincide 
or a point $P_i$ falls along the intersection of two of the analytic pieces which 
make up $\Sigma$. 

These facts show that the locus $\Sigma'$ of the points P is made up of a finite 
number of analytic pieces regularly bounded by the given closed stream 
lines. It is evident that $\Sigma'$ is cut in the same sense throughout by the stream 
lines. 

To complete a proof that $\Sigma'$ is a surface of section we need only show that 
every stream line cuts $\Sigma'$ in a fixed interval $\theta$ of time. If this were not the 
case it would be possible to find indefinitely long arcs of stream lines, which 
did not cut $\Sigma'$. An arc MN of this sort cannot approach near a boundary 
stream line, since stream lines cut $\Sigma'$ uniformly often near such a regular 
boundary. Likewise MN cannot contain part of an arc AB save near M or N. 
Consequently if P be the midpoint of MN, the stream line through a limiting 
position $\bar{P}$ of P (as the length of MN becomes infinite) cannot approach a 
boundary stream line anywhere and it cannot contain an arc AB. Hence this 
stream line will nowhere cut the given surface $\Sigma$, which is contrary to 
hypothesis. 



If $\Sigma'$ is not made up of a single analytic piece it is possible to replace $\Sigma'$ 
by a similar surface having continuity of any prescribed order. Also it is 
certain from the intuitive point of view that $\Sigma'$ may be taken to be a single 
analytic surface. To prove this, however, would appear to require an 
extensive digression, and the fact does not enter essentially into our later 
discussion. For this reason we shall speak of the surface of section as if it were 
composed of a single analytic piece. 

26. Existence of surfaces of section. A special case. In what cases will 
surfaces of section exist? 

A natural method of attack upon this question is to begin with an integrable 
dynamical problem, and pass to more general cases by the method of analytic 
continuation. This method was used by Poincaré in treating the restricted 
problem of three bodies (loc. cit.). The lemma of § 25 leads us to see that 
the existence of a surface of section with given boundary stream lines depends 
essentially on whether the totality of stream lines have a uniform tendency to 
wind about these closed stream lines in a particular way. Such a tendency 
is clearly not altered by small variation of a parameter in the dynamical 
problem. 

Our method will be entirely different. We will commence with the 
discussion of a simple and particularly important case: 

If in a reversible problem p = 0 with no ovals of zero velocity, there is a periodic 
orbit without double points which is cut by every other orbit at least once 
in any interval $\theta$ of time, there will exist a ring-shaped surface of section with 
two boundary stream lines corresponding to the given periodic orbit described in 
the two possible senses. 

The given periodic orbit cannot be of minimum type, for then nearby orbits 
could be found which did not intersect it during long intervals of time (see 
§ 14). Consequently it will be possible to imbed the orbit in an analytic 
family of closed curves whose curvature exceeds that of the tangent orbit 
by a quantity of the first order in the distance from the orbit. 

In fact it was seen earlier that if the orbit was taken into the s-axis in an 
sn-plane by a suitable conformal transformation, the quantity I became 
positive (see § 19, (a)). Thus $\delta n''$ is negative when n is positive ($\delta n$ being any 
solution of the differential equation of normal displacement). Hence the 
curves n = const. form an analytic family of the stated type. 

If we use variables s, n, $\psi = \arctan(n'/s')$ for rectangular coordinates of 
the manifold of states of motion, the equation of the orbit is n = 0, $\psi = 0$ or $\pi$. 
The states of motion corresponding to tangency with the curves n = const. 
form the planes $\psi = 0$ or $\pi$. The stream lines through any point of these 
planes are at a distance n from the boundary stream line corresponding to 
the periodic orbit. Since $d\psi/dt$ is of the order of n, the set of tangent states 
of motion to curves of the family n = const. (in either sense) are represented 
by two strips which are regularly bounded by one of the two stream lines 
corresponding to the given periodic orbit. 

Now adjoin to this analytic family of curves another family which begins 
with the last member of the first family on one side of the given periodic 
orbit and ends with a point curve. For example, if we imagine the region 
bounded by the last curve of the first family to be conformally thrown into a 
circle, the second family may be taken to be the set of curves represented by a 
set of concentric circles. 

In the manifold of states of motion the states of motion corresponding to 
tangency with a curve of the second family are represented by a surface which 
is everywhere analytic. Indeed, if we use the variables u, v, $\chi = \arctan(v'/u')$ 
where u, v are rectangular coordinates in the plane of the concentric circles 
with origin at the center, the equation of the tangent states becomes 

$$u \cos\chi + v \sin\chi = 0$$

which is analytic in u, v, $\chi$-space. 
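The tangency condition for the concentric circles says that the direction of motion is perpendicular to the radius vector. A small numerical check, offered only as an illustration with hypothetical function names, is the following; the symbol `chi` stands for the direction angle arc tan v'/u'.

```python
import math

def tangent_angle_on_circle(u, v):
    """Direction angle of the counterclockwise tangent (-v, u) to the circle
    about the origin through the point (u, v)."""
    return math.atan2(u, -v)

def tangency_residual(u, v, chi):
    """Left-hand side u cos(chi) + v sin(chi) of the tangency condition."""
    return u * math.cos(chi) + v * math.sin(chi)
```

For a tangent direction the residual vanishes up to rounding, whereas for the radial direction at (u, v) it equals the distance of the point from the center.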

By combining this analytic surface with the two strips obtained above we 
obtain a ring-shaped surface $\Sigma$, made up of three analytic pieces, and 
regularly bounded by the two closed stream lines corresponding to the two senses 
of description of the periodic orbit. 

The sense in which a stream line cuts this surface is evidently determined 
by the relative curvature of the curve of the auxiliary family and of the orbit 
at the point of tangency. For imagine the curves of such a family to be 
deformed analytically into a family z = const. in a wz-plane, and that the 
variables w, z, $\omega = \arctan(z'/w')$ are employed. The angle between the 
surface $\omega = 0$ in wz$\omega$-space and the stream line has 
$\omega'/\sqrt{w'^2 + z'^2 + \omega'^2}$ for its sine, and will vary its sign according to 
the sign of $\omega'$, which determines the relative curvature of the auxiliary curve 
z = const. and the tangent orbit. 

If we call the positive sense that in which the tangent orbit is externally 
tangent to a curve of the auxiliary family it is clear that the surface $\Sigma$ is cut 
positively by the stream line near the boundary stream lines and near the 
line of $\Sigma$ which corresponds to the point auxiliary curve. 

Now it is apparent that an orbital arc which crosses over the region on 
the characteristic surface covered by the auxiliary curves corresponds to an 
arc AB of a stream line which crosses $\Sigma$ once more positively than negatively. 
The number of external tangencies will exceed the number of internal 
tangencies by unity of course. Thus we have $\Sigma$ imbedded in a set of arcs AB of 
stream lines which satisfy the condition imposed in the lemma of § 25. 

In virtue of our hypothesis about the given periodic orbit there will be 
such orbital arcs AB on any orbit. We infer that all of the conditions of the 
lemma are satisfied, so that a surface of section exists bounded by the two 
stream lines which correspond to the given periodic orbit. 

Our result may easily be extended as follows: 

If there is not more than one oval of zero velocity at least on one side of the 
given periodic orbit, a surface of section of the same type as before will exist. 

Suppose that at least one oval of zero velocity is present on each side of the 
given periodic orbit. The preceding method cannot be applied on either 
side of that orbit. In this case by hypothesis there is only one oval on one 
side, and we proceed precisely as before save that the second auxiliary family 
of curves is made to end with the oval of zero velocity instead of with a point 
curve. 

We shall not attempt to give the obvious analytical work necessary to 
establish that the modified surface $\Sigma$ will be analytic along the curve 
corresponding to the oval of zero velocity, and will be cut by the stream lines 
positively along this curve. 

27. Existence of surfaces of section in the reversible case. If a finite set 
of periodic orbits in a reversible problem separate the characteristic surface 
into simply connected regions, and if any orbit whatever cuts one at least 
of the set in every interval $\theta$ of time, the periodic orbits will be said to form a 
primary set. 

If in a reversible problem with no ovals of zero velocity there exists a primary 
set of periodic orbits, there will exist a surface of section with boundary stream 
lines corresponding to these orbits. 

Evidently the preceding paragraph deals with a special case. 

In no other case can there be a boundary curve of one of the regions of 
the characteristic surface which forms a single complete orbit. When a 
boundary orbit of this sort exists p cannot be greater than zero, for if p > 0 
the region on either one side or the other of this orbit is multiply connected. 
Also for the same reason there cannot exist other orbits of the primary set 
even in the case p = 0. 

Consequently in every other case there will be at least one vertex on every 
boundary curve of a region, and we will therefore assume such vertices to be 
present. 

It is easy to modify our earlier method to meet this case. 

First we construct an analytic family $F_1$ of curves corresponding to each 
side of a region (Fig. 9) so as to have a curvature which exceeds that of the 
tangent orbit at each point. To do this we may choose any function n(s) 
for which $n'' > -In$ in the sn-plane used above. The family of curves 
along which the ordinate in this plane is proportional to n(s) will have the 
desired property. 




Next we imagine an analytic transformation to be made which takes two 
arcs of boundary orbits crossing at a vertex into two perpendicular straight 
lines, say the axes in a uv-plane. The second family $F_2$ of curves is the 
image of the set uv = const. in the uv-plane (see figure). 



Now from either side of each vertex of a region draw an analytic arc ending 
at a point O within the region, and let these analytic curves be so chosen as to 
have no double points and not to meet save at O and there at an angle not 
zero. Let us construct as much of each curve $F_1$ along each side as lies 
between the analytic curves to O drawn from points near its ends, and as much 
of each curve $F_2$ as lies between the nearby curves to O. Moreover, let us 
fill each of the regions up to O with a third type of analytic family $F_3$ (see 
figure). 

It is apparent that the families $F_3$ of curves may be made to meet along the 
curve to O at an angle less than $\pi$ toward the interior of the region, at least 
if the curves to O begin near enough to the vertices. We may conceive of 
them as meeting in this way all along the curve to O. 

Consider now the tangent states of motion to these curves for all the regions. 
Here we mean to include all states of motion at the point O and all states 
at a vertex formed by two curves which yield an orbit on the side of the 
vertex away from the point O. 

The lemma of § 25 will apply to the corresponding surface or surfaces $\Sigma$ 
in the manifold of states of motion. 

We note first that the states of motion along a line of vertices to a point O 
form an analytic manifold. For let us employ the variables x, y, $\phi$ and let 
s denote the arc length along the curve to O. The states of motion in 
question are then given by the equations 

$$x = x(s), \quad y = y(s), \quad \phi_1(s) \le \phi \le \phi_2(s),$$

where x(s), y(s), $\phi_1(s)$, $\phi_2(s)$ are analytic in s. This evidently gives a 
piece of an analytic surface. 



Secondly, consider the states of motion corresponding to the family $F_2$. 
If we employ the variables u, v defined above and $\psi = \arctan(v'/u')$, tangent 
states of motion to a curve uv = const. are given by 

$$u \sin\psi + v \cos\psi = 0,$$

which is analytic in uv$\psi$-space. 
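This tangency condition follows in one line from the equation of the curves. In the sketch below $\psi$ denotes the direction angle arc tan v'/u' of the state of motion; the derivation is added for illustration and is not part of the original text.

```latex
% Along a curve uv = const. the differential of uv vanishes:
v\,du + u\,dv = 0 .

% A state of motion with direction angle \psi has (du, dv) proportional
% to (\cos\psi, \sin\psi), hence tangency means
v\cos\psi + u\sin\psi = 0 .
```

The left-hand side is a product-free combination of u, v and of analytic functions of $\psi$, which is why the locus it defines is analytic even at the vertex u = v = 0.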

The other pieces arising from $F_1$ and $F_3$ may be treated as in the preceding 
paragraph. 

Thus the surface $\Sigma$ is made up of analytic pieces. It is clear that $\Sigma$ is 
without multiple points, and that its boundaries are the closed stream lines 
representing the (doubly taken) periodic orbits of the primary set. 

In order to distinguish between the two sides of $\Sigma$ we note that the vector 
at a point representing a state of motion may be directed inside of or outside 
of the auxiliary curve which passes through the same point. If the point 
falls at a vertex where two such curves meet, the vector may be directed toward 
the interior angle or the opposite angle, or lie outside of them. 

For a state of motion near to $\Sigma$ we have a direction nearly coincident with 
that of the auxiliary curve through the point, at least if the point does not lie 
near a vertex. If the point lies near a vertex formed by two auxiliary curves 
its direction must be nearly coincident with one of the tangent directions at 
the nearby vertex. If the point lies near O any direction gives a state of 
motion near $\Sigma$, since the direction at O of tangent directions is arbitrary. 

It is now clear that if we define the positive side of the surface $\Sigma$ as that 
side for which the direction is outward from the auxiliary curves, and the 
negative side as that for which the direction is inward, we obtain a consistent 
definition for the part of $\Sigma$ which corresponds to a single region. In fact, it 
is then possible to pass from any point near one side of $\Sigma$ to any other nearby 
point on the same side through nearby points on that side. 

The part of $\Sigma$ which arises from the neighborhood of a vertex has been seen 
to be analytic. Moreover the stream lines evidently cross $\Sigma$ here from the 
negative to the positive side, since the tangent orbits are externally tangent 
to the auxiliary curves. Thus the definition given is consistent throughout. 

With this definition in mind it becomes evident that all stream lines which 
cross $\Sigma$ near one of the boundary stream lines will cross from the negative to 
the positive side. 

Now associate with each point of the surface $\Sigma$ a segment AB of a stream 
line which corresponds to an orbital arc A'B' which extends on each side of 
the corresponding point of tangency until it reaches the boundary of the 
region on which the point of tangency lies and is of arc length as large as a 
fixed small quantity d. In this way every point of $\Sigma$ is imbedded within a 
stream line AB which varies with the point of $\Sigma$, even when the corresponding 
point of the characteristic surface passes from one region to another through a 
vertex formed by two orbits. 

It is precisely the fundamental property of primary sets of periodic orbits 
which allows one to infer that the tangent orbit will cut the sides of the 
region, so that AB remains of limited length throughout. 

The part of the orbital arc A'B' (if any) which lies outside of the region con¬ 
taining the point of tangency cannot itself be tangent to an auxiliary curve. 
In fact, if it does extend outside, there is a point of tangency on A'B' within 
the region and near the boundary, by definition of AB . But stream lines 
lying near the boundary of 2 cut it in one and the same sense, and therefore 
not twice in a short interval of time. Interpreting this result we perceive 
that A'B' cannot be tangent again to an auxiliary curve outside of the region. 

Thus the surface 2 is imbedded by the segments AB of stream lines in the 
sense required for the application of the lemma of § 25. 

Moreover a stream line AB, no matter how complicated its form may be, 
will always cut 2 precisely once more positively than negatively. For A'B' 
becomes tangent to an auxiliary curve once more externally .than internally 
in crossing a region of the characteristic surface. 

Also according to the lemma of § 24 we can replace the boundary strips of 
2 by strips regularly bounded by the same closed stream lines. 

The modified surface 2 so obtained will satisfy all of the restrictions imposed 
in the lemma of § 25 provided that every stream line cuts it. We proceed 
now to complete our proof by establishing that every stream line will cut 2. 

Since the parts of surfaces 2 corresponding to opposite regions at a vertex 
hang together along a line, all of 2 corresponding to the vicinity of a vertex 
will form two pieces. Proceeding now to an adjoining vertex we are able 
to infer at once that the part of 2 corresponding to the abutting regions con¬ 
sists of at most two pieces. Continuing this process we finally conclude that 
there are at most two surfaces 2. 

The case when 2 consists of one surface is at once disposed of. Every 
stream line cuts 2 at least once as the corresponding orbit crosses a region. 
It is precisely the fundamental property of primary sets of periodic orbits, 
which allows us to infer this. 

There can be two surfaces $\Sigma$ only when two of the four regions of the
characteristic surface which abut upon any vertex yield one part of $\Sigma$ and the
other two yield the other part. If this were not the case all of the four regions
would belong to one part of $\Sigma$ and the above argument would establish that
there is but one surface. If then $\Sigma$ consists of two pieces it will have boundary
stream lines corresponding to each of the primary set of periodic orbits taken
in either sense, whereas if it consists of a single piece each such stream line is
used twice as a boundary.



Our previous argument is seen to apply without substantial modification 
to this case, once it has been observed that any orbit passes from one region 
into another adjacent to it at a vertex, and thus becomes tangent to auxiliary 
curves corresponding to both parts of $\Sigma$.

An extension of these results is easily made (compare with § 26): 

If there is not more than one oval of zero velocity in any region into which the 
periodic orbits of the primary set divide the characteristic surface, a surface of 
section of the same type as before will exist. 

The connectivity of the surface of section obtained by the above construc¬ 
tion evidently depends merely upon the relative disposition of the points of 
intersection of the orbits of the primary set. 

28. Existence of primary sets of periodic orbits. The application of the 
results of § 27 requires the existence of primary sets of periodic orbits. We 
now proceed to establish the existence of such sets under certain conditions. 

If in a reversible problem with p = 0 there is no oval of zero velocity and no periodic
orbit of minimum type without double points, the periodic orbit of minimax type 
known to exist (§ 17) forms a primary set. 

This periodic orbit clearly divides the characteristic surface into simply 
connected pieces. 

If an orbital arc can be found corresponding to long intervals of time which
does not intersect this periodic orbit, such an arc cannot approach it, for every
nearby orbital arc intersects the orbit of minimax type ($\sigma \neq \infty$) at least
once in every interval $\theta$ of time. Consequently there will exist a limiting
orbit which passes through a limiting position of a midpoint of one of these
arcs with a limiting direction, and which never intersects the orbit of minimax
type. The orbit of minimax type and this limiting orbit form the concave
boundaries of a ring within which a periodic orbit of minimum type without
double points can be found (see §§ 8, 9).

On account of our assumption that no such orbits of minimum type exist, 
we conclude that every orbital arc intersects the periodic orbit of minimax 
type in an interval $\theta$ of time, so that this orbit forms a primary set.

If a periodic orbit of this minimum type exists, any primary set of periodic
orbits must evidently contain a periodic orbit which cuts the orbit of minimum
type. In fact, nearby orbital arcs can be found which do not cut it
for an indefinite length of time.

Our earlier tests do not yield orbits of this kind, at least in certain cases. 

For example, a dumb-bell-shaped solid has one such orbit of minimum type
in its equatorial plane, and two orbits of minimax type, one on each side of
the orbit of minimum type. It is possible to infer the existence of infinitely
many other orbits of minimax type winding around either end of the dumb-bell
by our earlier methods. But these methods seem insufficient to secure
an orbit which cuts the orbit of minimum type, although this is necessary
and must be done before a primary set can be found.

In this particular problem any plane through the axis of the dumb-bell 
intersects it in a periodic orbit which every other orbit cuts in a fixed interval 
of time. Thus a primary set does exist in this case also. 

If in a reversible problem with p > 0 there is no oval of zero velocity, a pri¬ 
mary set of periodic orbits will always exist. 

To show this, let us begin by drawing a set of closed curves in the charac¬ 
teristic surface which divide that surface into simply connected pieces no 
matter how these curves may be deformed. The set L of orbits of minimum 
type deformable into these curves will exist (§ 9) and will divide the charac¬ 
teristic surface into simply connected regions. 

If every orbit cuts one of the orbits L within any interval $\theta$ of time we
have before us the desired set of primary periodic orbits. 

In the contrary case there must be orbits which fail to cut any orbit of the 
set for any arbitrary length of time. An orbit of this type cannot approach 
an orbit of the set L which is intersected by other orbits L. For then it 
would cut these other periodic orbits of the set. Thus an orbit of this type 
can only approach an isolated periodic orbit of the set L. But there can be 
no isolated periodic orbit L for p > 0 since that would imply doubly con¬ 
nected regions on the characteristic surface on one side or the other of the 
isolated periodic orbit. 

Therefore we may assume that there exist orbits which lie wholly within
some region formed by the set L, and which do not approach its boundary.
Such a complete orbit $O$ may be obtained by constructing orbital arcs
corresponding to greater and greater intervals of time and not crossing an orbit L.
A complete orbit which passes through a limiting position of the midpoint
of these arcs, with a limiting direction, will obviously be a complete orbit of
the stated kind.

Consider now any region within which a complete orbit $O$ lies. On opposite
sides of the region we may draw two lines in the characteristic surface so taken
as to be deformable into one another but not to a point. The boundary
formed by the totality of orbits $O$ is a concave boundary towards the part
of the surface in which these two lines lie (§ 8). Hence we can find two
periodic orbits $o_1$ and $o_2$ of minimum type, one on either side of $O$ and
deformable into these two lines respectively.

These two orbits of minimum type yield an orbit $o_3$ of minimax type
deformable into $o_1$ or $o_2$ (§ 18), such that if $J = M$ along this orbit it will be
possible to pass from $o_1$ to $o_2$ with $J < M + \epsilon$ ($\epsilon$ small) but not with $J < M$.

At least one of the totality of periodic orbits inclusive of $o_3$ which may
be obtained from $o_3$ by continuous deformation under the restriction $J \le M$
will intersect every orbit $O$.



If this is not the case, we may assume $o_3$ on one side of some such orbit $O$,
say on the side toward $o_1$. The orbit $O$ cannot have $o_1$ as a limiting orbit
inasmuch as $O$ would not lie wholly within the given region in that case.

The orbit $o_3$ divides the curves $J \le M$ deformable to $o_1$ or $o_2$ into two
classes. The orbit $o_1$ belongs to one of these classes. Consider the curves
$J < M$ in the other class and lying on the same side of $O$ as $o_3$. In this class
there is a periodic orbit of minimum type $o_4$, which cannot be a limit orbit of
$O$ of course.

According to the principles of § 18 it will be possible to deform the curve
$o_4$ into $o_2$ with $J \le M$ and without approaching $o_3$.

We may now repeat the preceding argument using $o_4$ in place of $o_1$. We
are led to an orbit $o_5$ of minimax type along which $J = M' \le M$ and with
the further property that it is possible to pass from $o_4$ to $o_2$ with $J < M' + \epsilon$
but not with $J < M'$. If $o_5$ does not intersect $O$ we are again led to an orbit $o_6$
of minimum type on the same side of $O$ as $o_4$.

This process may be indefinitely continued, and will lead to an infinite
number of orbits of minimum and minimax type which may be obtained from $o_1$
or $o_2$ by deformation under the restriction $J \le M$.

Hence one of the totality of periodic orbits will intersect $O$ unless there are
an infinite number of such orbits with $J \le M$. In this exceptional case there
will be a finite number of analytic families of periodic orbits. We shall not
attempt to consider this possibility. The methods of § 18 indicate how it is
to be treated.

Therefore, if we adjoin to the set L the orbits of minimum and minimax
type with $J \le M$ for each region, the resultant set forms the desired primary
set of periodic orbits.

29. Reduction to a surface transformation T. Suppose now that a surface
of section S exists in the particular dynamical problem at hand, which may be
reversible or irreversible.

Consider an arbitrary point P of that surface. If we follow along the
stream line through P, in the sense of increasing t, to the first following point
of intersection, we get a definite point Q. The transformation T of the
surface of section which we shall consider is that which takes each point P
into the corresponding point Q, and we shall write Q = T(P).
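
The definition Q = T(P) can be tried numerically. The following is only a sketch under stated assumptions: the planar flow `spiral_flow`, the half-line y = 0, x > 0 used as section, and the helper names `rk4_step` and `first_return` are all hypothetical choices made for illustration. This toy flow is not one of the dynamical problems of the paper and has no invariant area integral; it only shows the "follow the stream line to the first following point of intersection" step.

```python
import math

def rk4_step(f, state, dt):
    """Advance state' = f(state) by one classical Runge-Kutta step."""
    k1 = f(state)
    k2 = f([s + 0.5 * dt * k for s, k in zip(state, k1)])
    k3 = f([s + 0.5 * dt * k for s, k in zip(state, k2)])
    k4 = f([s + dt * k for s, k in zip(state, k3)])
    return [s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

def spiral_flow(state, c=0.1):
    """Toy flow (an assumption): r' = -c r, angle' = 1 in polar coordinates."""
    x, y = state
    return [-c * x - y, x - c * y]

def first_return(f, x0, dt=1e-3, max_steps=1_000_000):
    """Follow the stream line from P = (x0, 0) to its next upward crossing
    of the section y = 0, x > 0, and return the x-coordinate of Q = T(P)."""
    state = [x0, 0.0]
    for _ in range(max_steps):
        new = rk4_step(f, state, dt)
        if state[1] < 0.0 <= new[1] and new[0] > 0.0:
            # Linear interpolation inside the step that crosses the section.
            s = -state[1] / (new[1] - state[1])
            return state[0] + s * (new[0] - state[0])
        state = new
    raise RuntimeError("stream line did not return to the section")

Q = first_return(spiral_flow, 1.0)
print(Q, math.exp(-0.2 * math.pi))  # numerical T(P) beside the exact value
```

For this linear flow the return time is exactly $2\pi$, so T multiplies x by $e^{-2\pi c}$, which gives an independent check on the numerical map.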

It is a self-evident consequence of the definition of S that Q varies analyti¬ 
cally with P, that for any point Q there is a unique point P, and that P and Q 
approach the boundary stream lines together.* 

* If the surface of section consists of more than one analytic piece, and P or Q lies on an
edge, our statement that P varies analytically with Q will be interpreted to mean that by a 
slight deformation of the surface of section about the points P and Q the transformation may 
be given analytic form. A similar convention is needed later. We shall always speak of the 
surface of section as though it were a single analytic surface. 




Let us prove that the transformation T is analytic along the boundaries of
the surface of section also. In order to do so with the greatest possible
dispatch we note first that by an analytic deformation of the manifold of states
of motion we may take the boundary stream line under consideration into a
straight line and at the same time take the surface of section into a plane
containing that line.

If we take the line as the z-axis in an xyz-space, and the surface of section
as the plane y = 0, the differential equations of the stream lines may be
written

$$\frac{dx}{dz} = F(x, y, z), \qquad \frac{dy}{dz} = G(x, y, z),$$

where we take z as the independent variable. This is permissible because
the stream lines are nearly parallel to the z-axis. The functions F and G
are of course analytic in their arguments.

The general solution (x, y) of these equations which reduces to $(x_0, y_0)$
for $z = z_0$ may be written

$$x = f(z, x_0, y_0, z_0), \qquad y = g(z, x_0, y_0, z_0),$$

where f and g are analytic in the indicated arguments. The stream line
which passes through the point $(x_0, 0, z_0)$ of the plane y = 0 may therefore
be written

$$x = f(z, x_0, 0, z_0), \qquad y = g(z, x_0, 0, z_0).$$

The point $(x_1, 0, z_1)$ where that stream line pierces the plane y = 0 at a
later time satisfies the pair of equations

$$x_1 = f(z_1, x_0, 0, z_0), \qquad 0 = g(z_1, x_0, 0, z_0).$$

The second of these equations determines $z_1$ as a function of $x_0$ and $z_0$. If $z_1$
as thus determined is analytic in $x_0$ and $z_0$ for $x_0$ small and $z_0$ arbitrary, then
by the first equation $x_1$ will also be analytic in $x_0$ and $z_0$. Recalling the
meaning of the variables x, y, z here, we see that we need only to show that
$z_1$ is analytic in $x_0$ and $z_0$.

The function $g(z, x_0, 0, z_0)$ vanishes identically for $x_0 = 0$ since the
z-axis is a stream line. Hence the function g contains a factor $x_0$. If this
factor be removed, the equation $g/x_0 = 0$ can be solved for $z_1$ as an analytic
function of $x_0$, $z_0$ provided that the z derivative of the resulting quotient $g/x_0$
does not vanish for $x_0$ small and z arbitrary. But the value of this derivative
along the axis is

$$g_{z x_0}(z, 0, 0, z_0). \tag{29}$$

Therefore, if we can establish that this quantity is not zero for $z = z_1$, the
transformation T will be analytic along the boundaries also.



Now the pair of functions

$$\delta x = f_{x_0}(z, 0, 0, z_0), \qquad \delta y = g_{x_0}(z, 0, 0, z_0)$$

form a solution of the differential equations of displacement from the z-axis
in xyz-space,

$$\frac{d\,\delta x}{dz} = F_x\,\delta x + F_y\,\delta y, \qquad \frac{d\,\delta y}{dz} = G_x\,\delta x + G_y\,\delta y.$$

This solution is obviously identified as the solution fulfilling the initial
conditions $\delta x = 1$, $\delta y = 0$ for $z = z_0$. These are obtained by differentiation of
the initial conditions $f = x_0$, $g = 0$ with respect to $x_0$. The quantity (29) is
seen to equal $\delta y'$.

At $z = z_1$ we have g equal to zero so that the function $\delta y$ is small near $z = z_1$.

Now, if $G_x \neq 0$, the second of the differential equations for $\delta x$, $\delta y$ may be
solved for $\delta x$ and the result substituted in the first equation. In this way
there results a linear differential equation for $\delta y$ of the second order, with
coefficient of $\delta y''$ not zero. Hence $\delta y$ changes sign in the vicinity of a point
where $|\delta y|$ is small; and $\delta y'$ is not small near such a point. The character of
the initial conditions on $\delta y$ at $z = z_0$ is to be borne in mind. Accordingly
our proof will be complete if it is shown that $G_x(0, 0, z) \neq 0$ for any z.

But the stream lines cut y = 0 at an angle which is of the order of $x_0$ because
the stream lines cut the surface of section at an angle of the same order as
the distance from the boundary stream line. The angle with y = 0 has a
sine $G/\sqrt{1 + F^2 + G^2}$. This is of the same order as $x_0$ if and only if $G_x$ is
not zero along the z-axis.

A final property of T (noted by Poincaré, loc. cit.) which plays an
important rôle in the sequel is that it possesses an invariant area integral $\iint p\,da$.
In order to see this we consider a small surface element of S, say $\Delta S$. The
tube of stream lines erected on this element as base may be continued until
they intersect the surface of section in a second element $\Delta S$. These two
surface elements bound the part of the tube to which we confine attention.
The second element is obtained from the first by the transformation T. The
rate of flow across the two boundaries of the tube is the same, at least if we
employ $xy\varphi$-space, since the motion is that of an incompressible fluid. This
rate is approximately measured by the normal velocity at any point of the
element multiplied by its area. Hence if p denotes the normal velocity the
exact rate of flow across any element is measured by $\iint p\,da$.

The function p is obviously analytic save when we are considering a point
of either end of the tube which is derived from a point of zero velocity on the
characteristic surface, so that the variables $x, y, \varphi$ fail. If, however, we
slightly displace one or both elements so that neither involves such a point
we obtain a modified function $p_1$ which is analytic, and the displacement
back to the first position will merely modify $p_1$ to p by multiplying $p_1$ by an
analytic factor. Hence p is always analytic. Furthermore p is clearly
positive throughout save along the boundaries where it vanishes since the
normal velocity is zero there.

Our results may be summed up in the following conclusion: 

The transformation T of the surface of section S is a one-to-one analytic
transformation of S throughout, which possesses an invariant area integral $\iint p\,da$
where p is everywhere analytic and is positive except along the boundary stream
lines where p vanishes.

In my earlier paper on the restricted problem of three bodies (loc. cit.)
I pointed out that the problem presented by a transformation T is equivalent
to that presented by the dynamical problem with which we start. The
transformation T, however, involves essentially only one arbitrary function (since
by a deformation of S the transformation T can be made to become an
area-preserving transformation of a fixed surface), whereas even in the form (1),
(4') of the equations of motion, two arbitrary functions, namely $\lambda$ and $\gamma$, are
involved. We have then here a genuine reduction of the problem from both
an analytic and qualitative point of view.

In the present paper we shall only make application of the transformation 
T to the periodic orbits. Such orbits correspond to invariant points of the 
characteristic surface under the transformation T or its iterations. 


Part IV. Periodic orbits and the transformation T 

30. First theorem on invariant points. We will fix attention at first upon a
closed analytic surface $S_1$ which admits a one-to-one analytic sense-preserving
transformation $T_1$ into itself that is not assumed to possess an invariant area
integral.

In order to state concisely our result concerning the invariant points of
such a surface $S_1$ under the transformation $T_1$, we need to make a classification
of invariant points. Let u, v be regular coordinates of the surface in the
neighborhood of an invariant point u = v = 0 of $S_1$. The coordinates (u', v')
of the transformed point of (u, v) are then expressible in power series of
the form

$$u' = au + bv + \cdots, \qquad v' = cu + dv + \cdots \qquad (ad - bc > 0).$$

If the roots $p_1$ and $p_2$ of the characteristic equation

$$\begin{vmatrix} a - p & b \\ c & d - p \end{vmatrix} = 0$$

are both different from 1, the invariant point is said to be a simple invariant
point. Otherwise it is said to be a multiple invariant point.




If $p_1$ and $p_2$ are real then both are of the same sign since their product is
$ad - bc > 0$. A simple point for which we have $0 < p_1 < 1 < p_2$ will be
called a directly unstable invariant point. If we have $p_1 < -1 < p_2 < 0$, the
invariant point will be said to be inversely unstable. All other simple invariant
points will be called stable.
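
The classification just given depends only on the linear terms a, b, c, d, so it can be written as a short test. This is a sketch rather than anything from the paper: the function name `classify_invariant_point` and the tolerance `tol` are assumptions. It uses the identities $(1 - p_1)(1 - p_2) = 1 - (a + d) + (ad - bc)$ and $(1 + p_1)(1 + p_2) = 1 + (a + d) + (ad - bc)$, so the roots never have to be computed.

```python
def classify_invariant_point(a, b, c, d, tol=1e-12):
    """Classify the invariant point of u' = au + bv + ..., v' = cu + dv + ...
    from its linear terms, following the definitions in the text."""
    det = a * d - b * c
    if det <= 0:
        raise ValueError("a sense-preserving transformation needs ad - bc > 0")
    trace = a + d
    at_plus_one = 1.0 - trace + det    # equals (1 - p1)(1 - p2)
    at_minus_one = 1.0 + trace + det   # equals (1 + p1)(1 + p2)
    if abs(at_plus_one) <= tol:
        return "multiple"              # some root equals 1
    if at_plus_one < 0:
        return "directly unstable"     # 0 < p1 < 1 < p2
    if at_minus_one < 0:
        return "inversely unstable"    # p1 < -1 < p2 < 0
    return "stable"                    # every other simple invariant point

print(classify_invariant_point(0.5, 0.0, 0.0, 2.0))
print(classify_invariant_point(-2.0, 0.0, 0.0, -0.5))
```

A negative value of $(1 - p_1)(1 - p_2)$ forces real roots, and since their product is positive they are then both positive, which is why the single sign test suffices for the directly unstable case; the inversely unstable case is similar.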

The usefulness of these definitions lies in the fact that only in the unstable
case do some points rapidly approach the invariant point, while others rapidly
recede from it, with iteration of $T_1$.*

By a slight modification of $T_1$ a multiple invariant point can evidently be
decomposed into simple invariant points, k of which are stable or inversely
unstable, and l directly unstable, say. In this case we will agree to make the
convention that the multiple invariant point is counted for k stable or inversely
unstable, and l directly unstable invariant points. It is not implied here that k
and l are necessarily the same for all modes of decomposition, although later
work will show that the difference $k - l$ has a value independent of the mode
of decomposition.

First theorem on invariant points. If a one-to-one analytic transformation
$T_1$ of an analytic closed surface $S_1$ of genus q can be generated by a
deformation of the surface into itself, the difference between the number of directly
unstable and other invariant points is $2q - 2$.

Proof. We will proceed first upon the assumption that the deformation $T_1$
is so slight that a unique short geodesic arc may be drawn from any point P
of the surface to its image $T_1(P)$.

If we associate with the point P the direction of this unique geodesic we
obtain a set L of line elements, defined at every point of $S_1$ save at the invariant
points, and varying analytically with the point P.

Concerning such a system of line elements we have the following lemma
essentially due to Poincaré.†

Lemma. Let a system of line elements on a closed surface of genus q have a
certain number of points of indetermination $P_1, P_2, \ldots, P_n$. Let the total
rotation of the direction element when a small positive circuit of $P_i$ is made
be $2\delta_i\pi$ ($\delta_i$ an integer). Then we have $\sum_i \delta_i = 2 - 2q$.

This equality is applicable to the set of line elements L. In order to make
this application we shall determine what the numbers $\delta$ are for the various
types of invariant points.

* In this general connection see a paper by Levi-Civita, Annali di matematica,
ser. 3, vol. 5, pp. 221-307.

† Journal de mathématiques, ser. 4, vol. 1 (1885), pp. 203-208.

In the neighborhood of a simple directly unstable invariant point $P_0$ of $S_1$
let us project $S_1$ upon the tangent plane at $P_0$, and let u, v be rectangular
coordinates with origin at the invariant point in that plane. A suitable
orientation of the axes in that plane may be made which will take the
transformation $T_1$ into the normal form

$$u' = p_1 u + \cdots, \qquad v' = p_2 v + \cdots.$$

This follows at once from the well-known theory of the linear transformation. 

If the point P = (u, v) describes a small circle about the origin, say of
radius $\epsilon$, in the uv-plane, the point $T_1(P) = (u', v')$ will then describe a
small approximate ellipse with center at the origin and with semi-axes
$p_1\epsilon$ and $p_2\epsilon$ along the u-axis and the v-axis respectively. The line from P
to $T_1(P)$ will make an angle with the u-axis with tangent

$$\frac{v' - v}{u' - u} = \frac{(p_2 - 1)v + \cdots}{(p_1 - 1)u + \cdots}$$

and will rotate through the angle $-2\pi$ inasmuch as $(p_2 - 1)/(p_1 - 1)$ is a
negative quantity.

But up to terms of higher order the direction of this straight line in the
uv-plane will be that of the corresponding geodesic on $S_1$.

At a simple directly unstable invariant point the number $\delta$ is $-1$.

Consider now a simple inversely unstable or stable invariant point for
which $p_1$ and $p_2$ are real. Here either both $p_1$ and $p_2$ are positive and less
than 1, or both positive and greater than 1, or both are negative. The
tangent of the angle of inclination of the line joining P to $T_1(P)$ in the
uv-plane has the same form as before for $p_1 \neq p_2$, but $(p_2 - 1)/(p_1 - 1)$ is
now positive. We conclude that $\delta$ is 1 in these cases. The exceptional case
$p_1 = p_2$ may be treated in a similar fashion by aid of the normal forms in this
case, and leads to the same result.

It may, however, happen that $p_1$ and $p_2$ are conjugate complex quantities.
In this case the linear transformation has the real form

$$u' = k\,(u \cos\sigma - v \sin\sigma) + \cdots, \qquad v' = k\,(u \sin\sigma + v \cos\sigma) + \cdots,$$

where u, v denote oblique coordinates in the tangent plane and k is positive.
If k = 1 the transformation is essentially a rotation near the origin so that
the line from P to $T_1(P)$ rotates through $2\pi$ when P rotates once around
the origin in a positive sense. If $k \neq 1$ we have essentially a rotation
compounded with a radial contraction or dilation. In this case also it is apparent
that the rotation is $2\pi$.

At a simple inversely unstable or stable invariant point the number $\delta$ is 1.
Thus for a small deformation and the case of simple invariant points we can
infer the truth of the theorem immediately from the lemma.
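
The rotation numbers found above can be checked numerically for the linear part of the transformation. The sketch below is an illustration only: the function name `rotation_number`, the sampling density and the circle radius are assumptions, and the higher order terms of the power series are dropped. It carries P once around a small circle and accumulates the rotation of the direction from P to its image.

```python
import math

def rotation_number(a, b, c, d, samples=3600, radius=1e-3):
    """Number of turns made by the direction from P to its image under the
    linear map (u, v) -> (au + bv, cu + dv) as P makes one positive circuit
    of a small circle about the invariant point at the origin."""
    total = 0.0
    previous = None
    for i in range(samples + 1):
        t = 2.0 * math.pi * i / samples
        u, v = radius * math.cos(t), radius * math.sin(t)
        du = (a * u + b * v) - u          # components of the vector from P
        dv = (c * u + d * v) - v          # to its image
        angle = math.atan2(dv, du)
        if previous is not None:
            step = angle - previous
            # Unwrap the jump of atan2 so that each step lies in (-pi, pi].
            while step > math.pi:
                step -= 2.0 * math.pi
            while step <= -math.pi:
                step += 2.0 * math.pi
            total += step
        previous = angle
    return round(total / (2.0 * math.pi))

print(rotation_number(0.5, 0.0, 0.0, 2.0))    # directly unstable point
print(rotation_number(-2.0, 0.0, 0.0, -0.5))  # inversely unstable point
```

For a simple invariant point the result agrees with the sign of $(1 - p_1)(1 - p_2)$, which is negative exactly in the directly unstable case.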

Moreover, the general case of multiple invariant points is a limiting case of
simple invariant points. Since the rotation number $\delta$ around a curve not
through an invariant point is not altered by a small modification of $T_1$, we
infer by a limiting process that each $\delta$ represents the difference between the
number of inversely unstable or stable points and directly unstable points which
coalesce. Thus, if $T_1$ is generated by a small deformation of $S_1$, the equality
of the theorem holds.

In order to extend the above proof to the case of an arbitrary analytic
deformation it is evidently sufficient to set up a system of line elements L
which has its points of indetermination at the invariant points of $S_1$, and
which is such that the geodesic direction from a point of $S_1$ near an invariant
point to its transformed position differs from the line element direction by
an angle which approaches zero with the distance from the invariant point.

We shall begin by setting up such a set of line elements in the cases q = 0
and q = 1 which present special features.

In the case q = 0 let us map $S_1$ upon a complex plane so that the point
at $\infty$ does not correspond to an invariant point. With each point P of that
plane we associate the straight line which joins P to $T_1(P)$. This yields a
set of line elements defined over the complex plane, save at the point A which
goes into $\infty$, and at the invariant points.

Now return to the surface $S_1$ on which we map this system of line elements.
Thus we obtain a set of line elements L' indeterminate at the invariant points
and at the images of $\infty$ and A.

As a point makes a positive circuit about A in the plane its image will make
a circuit about $\infty$ in a negative sense. We recall that the transformation $T_1$
is direct. Hence the number $\delta$ associated with A is $-1$.

Likewise as a point describes a large circle in a negative sense about the
origin (which corresponds to a small positive circuit about $\infty$ on $S_1$), the
image will be a nearly fixed point in the finite plane. The line joining the point
to its image will rotate in the same negative sense through an angle $-2\pi$ so
that the $\delta$ for $\infty$ is 1.

Applying now the equality of the lemma to the system of line elements L',
we observe that the two numbers $\delta$ arising from the points of indeterminateness
A and $\infty$ cancel. Noting further that the set L' has the geodesic property
demanded near invariant points, we infer that the sum of the numbers $\delta$ for
the invariant points of $T_1$ measures precisely the difference between the
number of directly unstable and inversely unstable or stable invariant points.
Thus the theorem is valid for q = 0.
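
A simple check of the count for q = 0, assuming a rigid rotation of the sphere about a diameter (an example chosen for this sketch, not one treated in the text): the only invariant points are the two poles, and near each pole the transformation is a rotation, so both are simple and stable with rotation number 1.

```latex
% Rigid rotation of the sphere (q = 0) through an angle \sigma, 0 < \sigma < 2\pi.
% Invariant points: the two poles. At each pole the roots of the
% characteristic equation are e^{\pm i\sigma}, so the point is simple and stable.
\[
  \sum_i \delta_i = 1 + 1 = 2 = 2 - 2q ,
  \qquad
  \#\{\text{directly unstable}\} - \#\{\text{other}\} = 0 - 2 = -2 = 2q - 2 .
\]
```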

The construction of a set L of line elements in the case q = 1 is still simpler.
The characteristic surface in this case may be mapped upon a set of congruent
rectangles in such wise that congruent points correspond to the same point of
the characteristic surface. We will define the direction at each point as that
given by the straight line which joins a point to its image in the plane. Since
this direction is the same at all congruent points we get a single direction at
each point of the characteristic surface. The points of indeterminateness will
evidently be furnished by the invariant points of $T_1$.

For q > 1 a similar method may be employed. The surface $S_1$ may be
mapped upon the plane of non-euclidean geometry in space of negative
curvature so as to yield a network of congruent polygons which fill the entire
plane. Corresponding to the continuous deformation $T_1$ of the surface $S_1$
we have a continuous deformation of the non-euclidean plane in which each
polygon undergoes a congruent relative deformation. If now we take a set
of congruent directions at a set of congruent points it is clear that the straight
line joining each of the points to its image will start from the same relative
position and rotate through the same angle during the deformation. Hence
if we consider the set of line elements in the non-euclidean plane which
indicate the direction from a point in its initial position to that point in its final
position, we obtain a set of line elements, one for each point of the surface $S_1$,
and indeterminate only at the invariant points under $T_1$.

It should be observed that in these cases q > 0 the set L may be looked upon 
as furnished by a set of geodesics joining a point to its image. 

Thus the theorem holds for q > 0 also. 

In order that $T_1$ may be taken as the result of a deformation it is clearly
necessary that every closed curve on $S_1$ is carried into a curve which may be
deformed back to its first position. This is an equivalent form for the
hypothesis of the theorem.

For the application which we have in view a slight extension of the theorem 
is required: 

If $S_1$ is not closed but possesses a finite number d of analytic boundaries which
are carried into themselves by $T_1$ in such a way as to leave no point of the boundaries
invariant, then the difference between the number of directly unstable and other
invariant points is $2q + d - 2$.

In order to justify this extension we need merely to state a slight
generalization of the lemma of Poincaré:

Let a system of line elements on a surface of genus q with d simply
connected regions $R_1, R_2, \ldots, R_d$ removed contain a certain number of points
of indeterminateness $P_1, P_2, \ldots, P_n$. Let the total rotation of the direction
element when a positive circuit of $R_j$ is made be denoted by $2\delta'_j\pi$, and let the
rotation of the line element when a small positive circuit of $P_i$ is made be
denoted by $2\delta_i\pi$. Then we have $\sum_i \delta_i + \sum_j \delta'_j = 2 - 2q$.

The proof is made exactly as in the earlier case; each boundary plays the
part of a stable invariant point, so that each $\delta'_j$ is 1 and $\sum_i \delta_i = 2 - 2q - d$.

When this modified lemma is applied to $S_1$, the stated extension follows
at once.




31. On the continuous case. It is very interesting to inquire what may be
inferred concerning the invariant points when $T_1$ is merely assumed to be
one-to-one and continuous, although this case does not arise in the dynamical
problem.

In the continuous case there is at least one invariant point for $q \neq 1$.

For q = 0 this result is due to Brouwer (loc. cit.). For $q \neq 1$ it is an
immediate corollary of our method which really required merely that $T_1$ be
analytic at the invariant points. If there were no invariant points, we should
thus be led to a contradiction at once.

It is interesting to note that there may be only one invariant point for
$q \neq 1$.

In the case q = 0 this is evident since a translation of the points of the plane 
projects stereographically into a transformation with one invariant point on 
the sphere. 

We shall give an example in order to establish the truth of the statement 
for q > 1. 

Consider a surface of genus q > 1 cutting both the vertical and horizontal 
planes in q + 1 ovals as in the figure (Fig. 10). In each of the four regions of 
the surface formed by these planes we may construct a set of stream lines of 
which the ovals noted are limiting stream lines and which have a determinate 
direction varying continuously with position save at the points forming the 
intersection of two of the ovals. 



Fig. 10.


Now suppose each point to move along the stream line on which it lies by a
distance which varies continuously on $S_1$ but tends toward zero as the point
approaches a point of intersection of two ovals. This construction evidently
yields a one-to-one continuous deformation $T_1$ of $S_1$ with invariant points
precisely the points of intersection of the ovals.

The peculiarity of $T_1$ which we wish to use is that all of these invariant
points can be joined by an arc PQ (see Fig. 9) without double points which
is made up of arcs of stream lines lying in the two planes.

Now imagine the surface $S_1$ to be covered with a membrane so that $T_1$
may be thought of as effecting a certain transformation of the points of the
membrane in $S_1$. Let the arc PQ of stream lines through all of the invariant
points be pinched to a point while the remainder of the membrane is
continuously deformed. The transformation $T_1$ of the modified membrane will
then leave only one point invariant, namely that one which corresponds to
all of the original invariant points. I assume that the deformation of the
membrane has been so made that the membrane does not overlap itself.
The corresponding transformation of $S_1$ has then the desired properties.

In the case q = 1 there need be no invariant point. For example, a slight 
rotation of an anchor ring about its axis displaces every point. 

32. Application of the first theorem. Consider an arbitrary surface of
section S in a dynamical problem and the associated transformation T. The
boundaries of S are taken into themselves by T, and the essential nature of
the transformation along such a boundary depends on the rotation number.
If the rotation number is not zero no points of the boundary can be invariant.
We shall assume that these rotation numbers are not zero.

We make this restriction in order to simplify the form of statement of our 
results. 

In order to apply the extended form of the first theorem to the
transformation T of S, we must know that T may be obtained by a deformation. This
is true in all cases q = 0 of course. It is also necessary to know what types
of periodic orbits correspond to stable and unstable invariant points. By a
stable periodic orbit is meant one such that the solutions of the differential
equation of normal displacement remain finite. All other periodic solutions
will be called unstable.*

Evidently a displacement along the stream lines of the part of S near an
invariant point will not affect the character of that point. As before let us
take the periodic orbit to fall along the x-axis in an xy-plane, and let us
assume that the surface of section is formed by the set of states of motion
x = 0, and that $0 \le t \le \tau$ represents the complete orbit. A suitable set
of coordinates is then $y(0)$, $y'(0)$, which we will denote by u, v respectively.
Now if $\delta y_1$, $\delta y_2$ stand for the solutions of the differential
equation of normal displacement satisfying the initial conditions

$$\delta y_1(0) = 1, \quad \delta y_1'(0) = 0, \quad \delta y_2(0) = 0, \quad \delta y_2'(0) = 1,$$

then we have (§ 14)

$$\delta y_1(t + \tau) = a\,\delta y_1(t) + b\,\delta y_2(t),$$

$$\delta y_2(t + \tau) = c\,\delta y_1(t) + d\,\delta y_2(t),$$

where we have $ad - bc = 1$. Consequently the solution $u\,\delta y_1 + v\,\delta y_2$
will be replaced by $(au + cv)\,\delta y_1 + (bu + dv)\,\delta y_2$ when t increases
by $\tau$. Hence the transformation T has the form

$$u' = au + cv + \cdots, \qquad v' = bu + dv + \cdots \qquad (ad - bc = 1),$$

where a, b, c, d retain the same meaning.

* See Levi-Civita, loc. cit.

Hence, if the invariant point is simple ($\rho_1 \neq 1$, $\rho_2 \neq 1$), the
corresponding periodic orbit is simple, since the condition for a periodic
solution of the equation of normal displacement is that either $\rho_1$ or
$\rho_2$ is 1.

A periodic orbit for which $\rho_1$ and $\rho_2$ are positive and not equal to 1
evidently corresponds to a directly unstable invariant point. Such an orbit
will be termed directly unstable, and will have the characteristic property that
the multiplicative solutions of the equation of normal displacement are
affected by real positive multipliers when a circuit of the orbit is made.
Inversely unstable and stable orbits may be similarly defined.

In the case when $\rho_1 = \rho_2 = 1$ we shall adopt the convention that the
multiplicity of the corresponding periodic orbit is the same as that of the
invariant point.
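The classification just described can be illustrated numerically. The following is a minimal sketch, assuming only the linear part $u' = au + cv$, $v' = bu + dv$ with $ad - bc = 1$, so that the two multipliers are the roots of $\rho^2 - (a + d)\rho + 1 = 0$. The function names `multipliers` and `classify`, the tolerance `tol`, and the label "degenerate" for the borderline case $|a + d| = 2$ are assumptions made for illustration and do not appear in the text.

```python
import cmath


def multipliers(a, b, c, d):
    """Roots of rho**2 - (a + d)*rho + 1 = 0 for u' = a*u + c*v, v' = b*u + d*v."""
    trace = a + d
    root = cmath.sqrt(trace * trace - 4.0)
    return (trace + root) / 2.0, (trace - root) / 2.0


def classify(a, b, c, d, tol=1e-9):
    """Label the invariant point from the trace a + d of its linear part.

    Hypothetical helper: the labels follow the text, except "degenerate",
    which is an illustrative name for the borderline case |a + d| = 2.
    """
    if abs(a * d - b * c - 1.0) > tol:
        raise ValueError("the linear part must satisfy ad - bc = 1")
    trace = a + d
    if trace > 2.0 + tol:
        return "directly unstable"    # real positive multipliers, neither equal to 1
    if trace < -2.0 - tol:
        return "inversely unstable"   # real negative multipliers
    if abs(trace) < 2.0 - tol:
        return "stable"               # complex multipliers of modulus 1
    return "degenerate"               # a multiplier equal to +1 or -1


print(classify(2, 1, 1, 1), multipliers(2, 1, 1, 1))
```

Since the product of the two multipliers is 1, the trace alone decides the type in this linear sketch; the higher-order terms of T matter only in the borderline case.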

If the transformation T of the surface of section S may be obtained by a
continuous deformation, and if no rotation numbers along the boundaries are zero,
the difference between the number of directly unstable and the other periodic orbits
is $2q - 2 - d$, where q is the genus of S and d the number of boundaries.

It is scarcely necessary to remark that the first theorem may be applied
also to periodic orbits which correspond to points of the surface of section
which are invariant under some definite power of T.

33. An extension of the first theorem. The application of the theorem of
§ 32 is only possible when the transformation T can be obtained by a
continuous deformation of S. This hypothesis is not satisfied in all cases. As a
matter of fact it is not satisfied by the transformations T belonging to the
surfaces of section given in § 27 for p > 0.

An extension of the theorem may then be used. This will be presented only
briefly.

Let us call all transformations T of S, derivable from one another by a
further continuous deformation, of the same class.

If we vary a transformation of the class from one member $T_1$ to another $T_2$
by variation of a parameter which enters analytically, the invariant points
will appear or disappear in pairs, after the fashion of points (x, y) defined
as the solution of a pair of analytic equations containing a parameter. It is
assumed of course that the given transformations $T_1$ and $T_2$ are analytic.




When invariant points appear or disappear, an equal number of directly
unstable and stable or inversely unstable invariant points combine. For
the number 5 taken around a region is fixed when such points appear or
disappear within the region. Hence, by definition, as many directly unstable as
stable or inversely unstable invariant points appear or disappear within the
region.

Moreover, by definition it is possible to vary continuously from any one
transformation of a class to any other. That is, if $T_1$ and $T_2$ are of the same
class, we may write $T_2 = T_1 T$, where T stands for a deformation of the
surface into itself. But in § 30 it was shown that a set of analytic curves,
analogous to geodesics, could be found for p > 0 joining each point P to its
image T(P). If now we imagine each point P to move along this curve
with uniform velocity in such a way as to reach T(P) after a second of time,
a transformation $T_t(P)$ is generated which is analytic in t. Also the
transformation $T_t$ will coincide with $T_1$ for t = 0, and with $T_2$ for t = 1.
Thus we may assume that $T_1$ is carried into $T_2$ analytically whenever $T_1$
and $T_2$ are of the same class.

We are thus brought to the following conclusion: 

For all one-to-one analytic transformations T of the same class on a surface S
the difference between the number of directly unstable and stable or inversely
unstable invariant points is the same.

The difference can be explicitly obtained from any one transformation of 
the class, or by general considerations of analysis situs. 

Evidently the theorem extends to the case when invariant boundaries are 
present. Each boundary is counted as a stable invariant point. 

The dynamical application of these results is the following: 

The difference between the number of directly unstable and stable or inversely 
unstable periodic orbits corresponding to invariant points of T depends only on 
the genus and number of boundaries of the characteristic surface, and on the class 
of the transformation T. 

34. Poincaré's last geometric theorem and a modification. Poincaré
showed that the existence of an infinite number of periodic orbits in the
restricted problem of three bodies and other dynamical problems followed at
once from a certain geometric theorem. The basis of this deduction was the
fact that a ring-shaped surface of section existed in these cases. The proof
of the geometric theorem was later given by me (loc. cit.).

We shall find that Poincaré's theorem leads to the conclusion that there
exist infinitely many periodic orbits whenever the genus q of the characteristic
surface is 0.

To show that there exist infinitely many periodic orbits in the case q > 0
I introduce a modification of his theorem below which requires a slight
variation of my proof.




For convenience we shall first state:

Poincaré's Theorem. Given a ring $0 < a \le r \le b$ in the $r\theta$-plane
($r$, $\theta$ being polar coordinates), and a one-to-one continuous area-preserving
transformation T of the ring into itself, which advances points on $r = a$ and
regresses points on $r = b$. Then there will exist at least two points of the
ring invariant under T.

The modification is the following:

Given an infinite ring $0 < a \le r$ in the $r\theta$-plane, and a one-to-one
continuous area-preserving transformation T of the ring into itself, which
advances points on $r = a$ and regresses all points $r \ge R > a$ by at least an
angle $\theta_1 > 0$. Then there will exist at least two points of the ring
$a \le r < R$ invariant under T.

We will indicate briefly the proof of this modified theorem.

Let us take $x = \theta$, $y = r^2$ as the rectangular coordinates of a point in the
xy-plane. The ring then appears as an infinite strip $y \ge a^2$. The
transformation T of this strip advances points of the boundary $y = a^2$ to the right,
and moves points to the left by at least $\theta_1$ for $y \ge R^2$. Moreover T is
area-preserving in the xy-plane (since we have $r\,dr\,d\theta = \tfrac{1}{2}\,dx\,dy$),
and displaces any two points which have the same ordinate and whose abscissas
differ by a multiple of $2\pi$ in the same way.
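The change of coordinates can be checked numerically. The sketch below, under stated assumptions, compares the area of a ring sector in polar coordinates with the area of its image rectangle in the strip (the two differ by the constant factor 2, which does not affect area preservation), and then evaluates the Jacobian determinant of a sample map of the strip. The map `sample_twist` is a hypothetical illustration with a = 1 and is not a transformation taken from the text; all function names are likewise assumptions.

```python
def to_strip(r, theta):
    """Coordinates of the text: x = theta, y = r**2."""
    return theta, r * r


def sector_area_polar(r_in, r_out, th0, th1):
    """Area of the sector r_in <= r <= r_out, th0 <= theta <= th1 of the ring."""
    return 0.5 * (r_out ** 2 - r_in ** 2) * (th1 - th0)


def sector_area_strip(r_in, r_out, th0, th1):
    """Area of the image rectangle of the same sector in the xy-plane."""
    x0, y0 = to_strip(r_in, th0)
    x1, y1 = to_strip(r_out, th1)
    return (x1 - x0) * (y1 - y0)


def sample_twist(x, y):
    """Hypothetical map of the strip y >= 1: advances points on y = 1 by 0.5
    and regresses points with y >= 4 by at least 1."""
    return x + 1.0 - 0.5 * y, y


def jacobian_det(f, x, y, h=1e-5):
    """Central-difference Jacobian determinant of a plane map f at (x, y)."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    dx_dx = (fxp[0] - fxm[0]) / (2 * h)
    dy_dx = (fxp[1] - fxm[1]) / (2 * h)
    dx_dy = (fyp[0] - fym[0]) / (2 * h)
    dy_dy = (fyp[1] - fym[1]) / (2 * h)
    return dx_dx * dy_dy - dx_dy * dy_dx


print(sector_area_polar(1.0, 2.0, 0.0, 1.0), sector_area_strip(1.0, 2.0, 0.0, 1.0))
print(jacobian_det(sample_twist, 0.3, 2.0))
```

A determinant equal to 1 at every point is the local statement that the map preserves area in the xy-plane, and hence in the ring.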

Let us combine T with a further transformation $T_\varepsilon$ which effects a
translation of the xy-plane in the direction of the y-axis through a distance
$\varepsilon$ ($\varepsilon > 0$). The transformation T followed by $T_\varepsilon$
yields an area-preserving transformation $TT_\varepsilon$ which shifts the strip
$y \ge a^2$ into the strip $y \ge a^2 + \varepsilon$.

Suppose if possible that there exists no invariant point of T for $y < R^2$.
There exists then a positive quantity d such that all points $a^2 \le y \le R^2$ are
displaced at least a distance d by the transformation T. Choose $\varepsilon$ less
than d and also less than $\theta_1$.

Consider now the narrow strip $a^2 \le y \le a^2 + \varepsilon$. By the transformation
$TT_\varepsilon$ the lower edge of this strip is carried into the upper edge, and the
strip is carried into a second strip lying wholly above the first one save along the
common edge. By a repetition of the transformation $TT_\varepsilon$ the second strip
goes into a third, and so on.

By a continuation of this process a series of strips is obtained forming
consecutive strata. Each of these strata is unaltered by a shift of $2\pi$ to the
right. This follows from the fact that T and $T_\varepsilon$ are single-valued over
the infinite ring.

The images of these strata on the ring are a set of closed strata about the
ring, all having equal area of course since $TT_\varepsilon$ is an area-preserving
transformation in the $r\theta$- as well as in the xy-plane. Consequently some one of
the strata on the infinite ring, say the kth, must overlap the circle
$r = R_1 > R$ for any choice of $R_1$.

In the xy-plane let Q be a point of the upper edge of the kth stratum for
which $y > R_1^2$ is a maximum. Let P be the point of $y = a^2$ from which Q is
derived by k-fold repetition of $TT_\varepsilon$, and let $P', P'', \ldots, P^{(k)} = Q$
denote the successive images of P under the iteration of $TT_\varepsilon$. Draw the
straight line PP' which will obviously lie on the first stratum. The successive
images of this line, $PP'$, $P'P''$, $\ldots$, $P^{(k-1)}P^{(k)}$, will lie in the
successive strata, and will have no points in common except that successive arcs
have an end-point in common. Thus we get a single arc PQ made up of all these
lines, which is without double points (Fig. 11).



[Fig. 11.]


Consider now a vector LL', drawn from a point L to its image L' under
$TT_\varepsilon$, of which the initial point moves from P to $P^{(k-1)}$ along the line
PQ. The angle which this vector makes with the positive direction of the x-axis at
the outset may be taken to be a positive acute angle, since the image P' of P lies
to the right of and above P. When L has varied to its final position $P^{(k-1)}$,
the same angle lies in the second or third quadrant, since $P^{(k)}$ lies to the left
of $P^{(k-1)}$ by the hypothesis of the theorem.

Our construction of the successive arcs $PP'$, $P'P''$, $\ldots$ renders it apparent
that as L moves from P to $P^{(k-1)}$ its image L' moves along the same curve
from P' to Q. Therefore we see at once from the figure that LL' has rotated
through the least positive angle from the first direction to the second. If L
is moved further to a position on $y = R_1^2$ the same will be true, for during this
additional variation the angle given by LL' may be made to remain in the
second or third quadrants, provided $R_1$ be taken sufficiently large at the
outset.

Suppose now that L moves in any manner from a point of $y = a^2$ to a point
of $y = R_1^2$ in the region $y \ge a^2$. The transformation $TT_\varepsilon$ leaves no
points of this region invariant, so that the point L' will never coincide with L. In
the initial position for L on $y = a^2$ the angle made by LL' lies in the first
quadrant. In the final position it lies in the second or third quadrant. But
the total variation of angle during the variation of L has been seen to be
through the least positive angle in a special case. Since any one path of L
from $y = a^2$ to $y = R_1^2$ can be varied continuously into any other, the same
must be true always.

Now let $\varepsilon$ approach zero. As $\varepsilon$ becomes smaller the vector LL'
continues to have a definite direction, since no invariant points under
$TT_\varepsilon$ are present. By a limiting process we infer that for the
transformation T the angular variation of LL' is through the least positive angle
consistent with its initial and final directions. It should be observed that for L on
$y = a^2$ the direction of LL' is the same as that of the positive x-axis.

Consider now the inverse transformation $T^{-1}$ which is of the same type
as T, although it moves points on $y = a^2$ to the left, and points to the right
for y sufficiently great. By an entirely analogous argument to that given
above we are led to infer that if a vector $LL^{(-1)}$ with end-point
$L^{(-1)} = T^{-1}(L)$ has its initial point L varied from a point of $y = a^2$ to a
point of $y = R_1^2$ ($R_1$ sufficiently large), the total angular variation will be
the least negative angle consistent with its initial and final positions.

But the total rotation of $LL^{(-1)}$ is precisely the same as that of the oppositely
directed vector $L^{(-1)}L$ which joins a point $L^{(-1)}$ of $y = a^2$ to its image L
under T.

Hence by our earlier result the total angular variation of $L^{(-1)}L$ must also
be the least positive angle consistent with the two positions. Thus we have
been led to a contradiction, so that there must exist at least one invariant
point.

Evidently invariant points can only arise for $y < R^2$. To prove that there
are at least two invariant points we may adopt the method used by Poincaré.
The total rotation of a vector drawn from a point in the $r\theta$-plane to its
image is $-2\pi$ along the inner boundary of the ring when a circuit is made
which keeps the ring on the left; the corresponding rotation is $+2\pi$ when a
large circle is positively traversed. In view of the analysis of § 30 we may
assert that there are precisely as many directly unstable as stable and inversely
unstable invariant points. It is conceivable that there is but one invariant
point from a geometric standpoint. But that point would have to be
considered as two coincident invariant points if we adopt the conventions of § 30.

As Poincaré pointed out, his geometric theorem leads to the conclusion
that there are infinitely many periodic orbits in the restricted problem of
three bodies and similar problems in which there is a ring-shaped surface of
section. There is the restriction, moreover, that the rotation numbers along
the two boundaries are not the same.

If, however, there are more than two boundaries in the case q = 0, this
theorem is not immediately applicable. Imagine a deformation of the surface
of section to be made which closes all of the boundaries except two. On the
resulting surface T will appear as a one-to-one continuous transformation
with an invariant area integral $\iint \rho\,dx\,dy$ where $\rho$ is a continuous
positive function on the surface, which may conceivably become infinite at the point
images of the boundaries. By a further continuous deformation we may
still further modify the surface of section so as to deform it into a ring, and
to make the invariant integral the area integral.* If now we proceed to argue
as Poincaré did in the case of a ring, we conclude that there exist infinitely
many periodic orbits.

If the surface of section has no boundaries the theorem of § 30 will enable 
us to infer the existence of at least two stable periodic orbits, and these (if 
distinct) may be expanded so that the modified surface of section becomes 
of the nature of a ring. If only a single boundary is present the same theorem 
would lead us to infer the existence of a further stable invariant point, and 
by expanding this point, we again obtain a ring. 

Infinitely many periodic orbits exist in the case p = 0, at least if there is a
surface of section, and if all of the rotation numbers for the invariant points and
the boundaries are not the same.

This restriction on the conclusion is a necessary one. The rotation of a
sphere about a diameter through an angle incommensurable with $2\pi$ affords
an example of a one-to-one analytic area-preserving transformation of the
sphere into itself in which there are only two invariant points for all powers
of the transformation. This same example renders it extremely doubtful
whether the periodic orbits are everywhere dense in all cases as Poincaré
conjectured.

To prove that there are infinitely many periodic orbits in the case p > 0 
we will use the modified theorem. 

Let us assume that there exists a single stable invariant point, or a boundary,
which has a rotation number different from zero. We may close each of these
boundaries by a deformation, as in the case q = 0.

Take first q = 1, and map the surface of section upon a network of
rectangles in the plane. By a deformation of one of these rectangles which
leaves its boundary fixed, we may transform the invariant area integral into
the area integral. This may be intuitively seen as follows: imagine the area
in question to be of density $\rho$, where $\iint \rho\,dx\,dy$ is the invariant area
integral. Any part of the transformed area will then have the same mass as before.
By a distortion of the rectangular area which leaves its boundaries fixed, it is
obviously possible to render this density uniform. The invariant integral
now appears as the area integral.

The transformation T yields a one-to-one area-preserving transformation
of the entire plane in which we have an invariant point P with a rotation
number different from zero.

* See my proof of Poincaré's theorem, loc. cit.

A radial dilation about P which lengthens radii from r to r' in accordance
with the formula $r'^2 = r^2 + \rho^2$ will leave the transformation an
area-preserving one, and at the same time will expand the point P into a circular
boundary of radius $\rho$. In the neighborhood of that boundary points are
transformed in such a way that the transformation may be thought of as continuous
along the boundary. The rotation number $\sigma$ for this boundary is the same as
for P.
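That this dilation preserves areas follows from $r'\,dr' = r\,dr$, and it can also be confirmed on annuli. The following is a minimal sketch, assuming the formula reads $r'^2 = r^2 + \rho^2$ with $\rho$ the radius of the new circular boundary; the function names `dilate` and `annulus_area` are illustrative assumptions.

```python
import math


def dilate(r, rho):
    """Radial dilation r -> r' with r'**2 = r**2 + rho**2 (rho is assumed to be
    the radius of the circular boundary into which the point P expands)."""
    return math.sqrt(r * r + rho * rho)


def annulus_area(r_in, r_out):
    """Euclidean area of the annulus r_in <= r <= r_out."""
    return math.pi * (r_out ** 2 - r_in ** 2)


rho = 0.5
before = annulus_area(1.0, 3.0)
after = annulus_area(dilate(1.0, rho), dilate(3.0, rho))
print(before, after)      # the two areas agree up to rounding
print(dilate(0.0, rho))   # the point P (r = 0) goes to the circle r' = rho
```

Because every annulus about P keeps its area and the dilation is radially symmetric, any area-preserving transformation of the plane remains area-preserving after the dilation.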

It is clear that distant points of the plane are moved only a limited distance 
by the transformation T . 

Consider now the kth power of the transformation T and choose k so large
that $k\sigma > 2l\pi$. By compounding T with a negative rotation through l
complete revolutions, leaving the circular boundary fixed, we obtain a
transformation T' which has the same effect on the points of the plane as $T^{(k)}$ but
which advances all of the points of the circular boundary through an angle
$k\sigma - 2l\pi$. The same transformation T' regresses distant points through an
angle nearly $2l\pi$.

By an application of the modified theorem we infer that there are two
invariant points of T', i. e., that there are two points which are invariant under
$T^{(k)}$ and which have revolved $-l$ times about the circular boundary under
$T^{(k)}$.

By letting l range through all possible values we infer the existence of
infinitely many periodic orbits in the case q = 1.
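The choice of the power k for each number of revolutions l can be made explicit. The following is a minimal sketch, assuming the boundary rotation number is a positive angle (written `sigma` here) and taking for definiteness the smallest admissible k, whereas the text only requires k to be large enough; the function names are illustrative assumptions.

```python
import math


def smallest_power(sigma, l):
    """Smallest integer k with k*sigma > 2*pi*l, for sigma > 0 and l >= 1.

    Any larger k would serve equally well in the argument of the text;
    taking the smallest one is an illustrative choice.
    """
    if sigma <= 0 or l < 1:
        raise ValueError("expected sigma > 0 and l >= 1")
    return math.floor(2.0 * math.pi * l / sigma) + 1


def boundary_advance(sigma, l):
    """Return k and the angle k*sigma - 2*pi*l by which the compounded
    transformation advances the points of the circular boundary."""
    k = smallest_power(sigma, l)
    return k, k * sigma - 2.0 * math.pi * l


for l in (1, 2, 3):
    print(l, boundary_advance(1.0, l))
```

Each l yields an admissible k, and hence a transformation to which the modified theorem applies, which is the source of the infinitely many periodic orbits.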

For q > 1 we map the characteristic surface upon a network of congruent
polygons in the non-euclidean plane. By a deformation of one of the
polygons, and at the same time a congruent deformation of the other polygons,
the invariant area integral may be made the area in the non-euclidean plane.
If the circle is taken as a unit circle with center at the origin in the
$r\theta$-plane, the non-euclidean area becomes

$$\iint \frac{r\,dr\,d\theta}{(r^2 - 1)^2}$$

($r$, $\theta$ being polar coordinates), which is infinite over the unit circle.*

Hence a dilation of the plane which changes radii in the ratio r' to r where

$$\frac{r'}{r} = \frac{1}{(1 - r^2)^{1/2}}$$

will take the circle into the complete plane, in which ordinary areas are
invariant.

We may now proceed essentially as in the case q = 1.

* See Schlesinger, Lineare Differentialgleichungen, vol. 2, part 2 (Leipzig, 1898), pp. 96-99.




If q > 0 and there exists a single boundary or stable invariant point of the
surface of section for which the rotation number is not zero, then there exist
infinitely many periodic orbits.

It is to be noted that the argument above shows incidentally that there are
infinitely many invariant points of a closed surface S of genus q > 0 under a
one-to-one area-preserving transformation and its iterations if there exists a
single stable invariant point of S.





Reprinted from Bull. Sci. Math., February, 1918, Vol. 42, pp. 41-43. 


ON THE DIRECT PROOF OF HENRI POINCARÉ'S LAST THEOREM
BY MR. DANTZIG

By G. D. BIRKHOFF.


In the Bulletin des Sciences mathématiques (February 1917),
Mr. Dantzig attempted a direct and elementary proof of
Poincaré's theorem.

I had corresponded with Mr. Dantzig on the subject of his
proof. In June 1916 he sent me his first attempt. It was almost
the same as the one he gave in the Bulletin, except that he did not
then consider the possibility that the curve L of zero angular
deviation might be met in several points by a ray.

In my reply to Mr. Dantzig, I tried to show him that he had
considered only the simplest case for the form of L. Moreover, as I
had pointed out to him, my friend and colleague Mr. E. B. Wilson
communicated the same proof to me some years ago. I reproduce here
(with Mr. Wilson's permission) a few lines of a letter which he wrote
to me in November 1912:

"Won't you bother with finding out what ridiculous error
there is in this simple thing that occurred to me yesterday?"

"Let

$$r' = f(r, \varphi), \qquad \varphi' = g(r, \varphi)$$

be the transformation ($r$, $\varphi$ polar coordinates) with the conditions

$$r_1 = f(r_1, \varphi), \qquad r_0 = f(r_0, \varphi),$$

$$\varphi' = g(r_1, \varphi) > \varphi, \qquad \varphi' = g(r_0, \varphi) < \varphi.$$

We build over the ring the surface $z = \varphi' - \varphi = g(r, \varphi) - \varphi$ so
as to transfer the wide variations in $\varphi$ to the vertical where they
don't look so mussy. Now the surface is attached to the inner ring
$r = r_0$ below the plane $z = 0$, to the outer ring $r = r_1$ above that
plane, and is everywhere continuous. Consider any continuous
curve

$$r = r(t), \qquad \varphi = \varphi(t), \qquad r(t_0) = r_0, \qquad r(t_1) = r_1$$

going from any point of the inner ring to any point of the outer,
and consider further $z(t) = g[r(t), \varphi(t)] - \varphi(t)$, the curve on
the surface cut out by a cylinder on the curve in the plane. We
have $z(t_0) < 0$, $z(t_1) > 0$ and $z(t)$ continuous. Hence there is
one point at least on this curve at which $\varphi' - \varphi = 0$, i. e. where
the motion is inward or outward along the radius from the point
to the transformed point. Now the intersection of

$$z = \varphi' - \varphi = g(r, \varphi) - \varphi$$

with $z = 0$ may be of great complexity containing ovals or ovals
within ovals in the ring. But as this intersection has to contain
all of its own limit points and cannot be traversed by any continuous
curve from the inner to the outer circle without being cut
in at least one point, the intersection must include at least one
continuous curve circling around the ring, perhaps imbedded in a
continuous region of intersection for aught I care. Now upon
this curve (this is almost the curve L of Mr. Dantzig) the shift
$r' - r$ is continuous and could not be always positive or always
negative without shrinking said curve or expanding it, contrary to
the supposed invariance of areas or integrals. Hence there must
be at least two points for which $r' = r$ as well as $\varphi' = \varphi$."

One sees that Mr. Dantzig's proof differs from that of
Mr. Wilson only at the very end (loc. cit., pp. 57-58).

Mr. Dantzig then sought to convince me that there always existed
a curve L meeting no ray in more than one point. In a second
letter I constructed almost the figure of page 57 of his article. He
then admitted the possibility which I had pointed out to him.

Mr. Dantzig has now gone only one step further. He has considered
the following case: 1° L is met by every ray at most three times;
2° there exist no other branches of the curve of zero deviation.

I conclude that Mr. Dantzig has applied his method only in an
extremely special case.

At bottom this method is very close to that of Poincaré, who
considered the curves L' of zero radial variation. Poincaré found
that his method sufficed for all the curves L' which he had imagined.
Mr. Dantzig has considered only two of the simplest possibilities.
It seems to me quite probable that Poincaré also considered the
curves L.

In conclusion, I wish to say that it is not impossible that one may
reach the goal by the consideration either of the curves L' or of the
curves L. But Mr. Dantzig's attempt seems to me to have left aside
all the real difficulties. And Mr. Wilson is of the same opinion.


(Extract from the Bulletin des Sciences mathématiques,
2nd series, vol. XLII; February 1918.)





[Reprinted from Science, N. S., Vol. LI, No. 1307, Pages 51-56, January 16, 1920]


RECENT ADVANCES IN DYNAMICS¹


A highly important chapter in theoretical dynamics began to unfold with the
appearance in 1878 of G. W. Hill's researches in the lunar theory.

To understand the new direction taken since that date it is necessary to recall
the main previous developments. In doing this, and throughout, we shall refer
freely for illustration to the problem of three bodies.

The concept of a dynamical system did not exist prior to Newton's time. By use
of his law of gravitation Newton was able to deal with the Earth, Sun, and Moon
as essentially three mutually attracting particles, and by the aid of his
fluxional calculus he was in a position to formulate their law of motion by means
of differential equations. Here the independent variable is the time and the
dependent variables are the nine coordinates of the three bodies. Such a set of
ordinary differential equations forms the characteristic mathematical embodiment
of a dynamical system, and can be constructed without especial difficulty.

The aim of Newton and his successors was to find explicit expressions for the
coordinates in terms of the time for various dynamical systems, just as Newton
was able to do in the problem of two bodies. Despite notable successes, the
differential equations of the problem of three bodies and of other analogous
problems continued to defy "integration."

Notwithstanding the lack of explicit expressions for the coordinates, Newton was
able to treat the lunar theory from a geometrical point of view. Euler, Laplace,
and others invented more precise analytical methods based upon series. In both
cases the bodies which are disturbing the motion of the Moon are assumed first to
move in certain periodic orbits, and the perturbations of the Moon are assumed to
be the same as if the other bodies did move in such hypothetical orbits. The
principle of successive approximations characterizes these methods.

¹ Address of the vice-president and chairman of Section A, Mathematics and
Astronomy, American Association for the Advancement of Science, St. Louis,
December, 1919.

The chief other advance made was based on the following principle: if a function
is a maximum or minimum when expressed in terms of one set of variables it is
also a maximum or minimum for any other set; hence, if the differential equations
of dynamics can be looked upon as the equations for a maximum or minimum problem,
this property will persist whatever variables be employed. This principle,
developed mainly by Lagrange, W. R. Hamilton, and Jacobi, enables one to make the
successive changes of variables required in the method of successive
approximations by merely doing so in a single function.

Here too the results are chiefly of formal and computational importance.

The last great figure of this period is Jacobi. His "Vorlesungen über Dynamik,"
published in 1866, represents a high-water mark of achievement in this direction.

Nearly all fields of mathematics progress from a purely formal preliminary phase
to a second phase in which rigorous and qualitative methods dominate. From this
more advanced point of view, inaugurated in the domain of functions of a complex
variable by Riemann, we may formulate the aim of dynamics as follows: to
characterize completely the totality of motions of dynamical systems by their
qualitative properties.

In Poincaré's celebrated paper on the problem of three bodies, published in 1889,
where he develops much that is latent in Hill's work, Poincaré proceeds to a
treatment of the subject from essentially this qualitative point of view.

A first notion demanding reconsideration was that of integrability, which had
played so great a part in earlier work. In 1887 Bruns had proved that there were
no further algebraic integrals in the problem of three bodies. Poincaré showed
that in the so-called restricted problem there were no further integrals existing
for all values of a certain parameter and in the vicinity of a particular
periodic orbit. Later (1900) Levi-Civita pointed out that there are further
integrals of a similar type in the vicinity of part of any orbit.

Thus it has become clear that the question as to whether a given dynamical
problem is integrable or not depends on the kind of definition adopted. However,
the most natural definitions have reference to the vicinity of a particular
periodic motion. The introduction of a parameter by Poincaré is to be regarded as
irrelevant to the essence of the matter.

From the standpoint of pure mathematics, a just estimate of the results found in
integrable problems may be obtained by reference to the problem of two bodies,
or, more simply still, of the spherical pendulum. The integration by means of
elliptic functions shows that the pendulum bob rotates about the vertical axis of
the sphere through a certain angle in swinging between successive highest and
lowest points. But the form of the differential equation renders this principal
qualitative result self-evident, while the most elementary existence theorems for
differential equations assure one of the possibility of explicit computation.
Hence the essential importance of carrying out the explicit integration lies in
its advantages for purposes of computation.

The series used in the calculations of the lunar theory and other similar
theories were given their proper setting by Poincaré. He showed that they were in
general divergent, but were suitable for calculation because they represented the
dynamical coordinates in an asymptotic sense.

The fact that the first ordor perturbations 
of the axes in the lunar theory can be 
formally represented by such trigonometric 
series had led astronomers to believe that the 
perturbations remained small for all time. 
But the fact of divergence made tho argument 
for stability inconclusive. 

It is easy to see that this question of 
stability, largely unsolved even to-day, is of 
fundamental importance from tho point of 
view formulated above. For, in a broad 
sense, the question is that of determining the 
general character of the limitations upon the 
possible variations of tho coordinates in dy¬ 
namical problems. 

We wish to mention briefly four important 
steps in advanco in this direction. 

The first is due to Hill, who showed in his
paper that, in the restricted problem of three
bodies, with constants so chosen as to give the
best approximation for the lunar theory, the
Moon remains within a certain region about
the Earth, not extending to the Sun. In fact
here there is an integral yielding the squared
relative velocity as a function of position, and
the velocity is imaginary outside of this
region.
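
The way such an integral bounds the motion can be illustrated numerically. The following is only a sketch and not Hill's own computation: it assumes the usual rotating-frame normalization of the planar restricted problem (unit separation of the two primaries, mass ratio `mu`), in which the integral gives the squared relative velocity as x^2 + y^2 + 2(1 - mu)/r1 + 2 mu/r2 - C. The function names and the values of `MU` and `C` are hypothetical, chosen only so that a forbidden region appears.

```python
import math


def squared_velocity(x, y, mu, C):
    """Squared relative velocity given by the integral (assumed normalization)."""
    r1 = math.hypot(x + mu, y)       # distance to the primary placed at (-mu, 0)
    r2 = math.hypot(x - 1 + mu, y)   # distance to the primary placed at (1 - mu, 0)
    return x * x + y * y + 2 * (1 - mu) / r1 + 2 * mu / r2 - C


def motion_possible(x, y, mu, C):
    """A position can be reached only if the squared velocity is not negative."""
    return squared_velocity(x, y, mu, C) >= 0


# Hypothetical values for illustration only.
MU, C = 0.01, 3.5

if __name__ == "__main__":
    for point in [(0.09, 0.0), (0.0, 1.0), (3.0, 0.0)]:
        print(point, motion_possible(*point, MU, C))
```

Positions at which the squared velocity would be negative form a barrier that no motion with the given constant can cross, which is the sense in which the velocity is "imaginary" there.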

In his turn, Poincaré showed that stability
exists in another sense, namely, for arbitrary
values of the coordinates and velocities there
exist nearby possible orbits of the Moon
which take on infinitely often approximately
the same set of values. His reasoning is extremely
simple, and is founded on a hydrodynamic
interpretation in which the orbits
appear as the stream lines of a three-dimensional
incompressible fluid of finite volume in
steady motion. A moving molecule of such a
fluid must indefinitely often partially reoccupy
its original position with indefinite
lapse of time, and this fact yields the stated
conclusion.

In 1901 under the same conditions Levi- 
Civita proved that, if the mean motions of 
the Sun and Moon about the Earth are commensurable,
instability exists in the following
sense: orbits as near as desired to the funda¬ 
mental periodic lunar orbit will vary from 
that periodic orbit by an assignable amount 
after sufficient lapse of time. This result, 
which is to be anticipated from the physical 
point of view, makes it highly probable that 
instability exists in the incommensurable case 
also. 

These three results refer to the restricted 
problem of three bodies. 

Finally there is Sundman’s remarkable
work on the unrestricted problem contained
in his papers of 1912 and of earlier date.
Lagrange had proved that if a certain energy
constant is negative, the sum of the mutual
distances of the three bodies becomes infinite.
Sundman showed that, even if this constant
is positive, the sum of the three mutual distances
always exceeds a definite positive quantity,
at least if the motion is not essentially
in a single plane. Thus he incidentally verified
a conjecture of Weierstrass that the
three bodies can never collide simultaneously.
These and other results seem to me to render
it probable that in general the sum of the
three distances increases indefinitely. Thus,
if this conjecture holds, in that approximation
where the Earth, Sun and Moon are
taken as three particles, the Earth and Moon
remain near each other but recede from the
Sun indefinitely. The situation is worthy of
the attention of those interested in astronomy
and in atomic physics.

As we have formulated the concept of
stability, it is essentially that of a permanent
inequality restricting the coordinates. We
may call a dynamical system transitive in a
domain under consideration if motions can be
found arbitrarily near any one state of motion
of the domain at a particular time which pass
later arbitrarily near any other given state.
In such a domain there is instability. If we
employ the hydrodynamic interpretation used
above, the molecule of fluid will diffuse
throughout the corresponding volume in the
transitive case, and will diffuse only partially
or not at all in the intransitive case. The
geodesics on surfaces of negative curvature,
treated by Hadamard in 1898, furnish a
simple illustration of a transitive system,
while the integrable problem of two bodies
yields an intransitive system. Probably only
under very special conditions does intransitivity
arise.

It is an outstanding problem of dynamics 
to determine the character of the domains 
within which a given dynamical system is 
transitive. 

A less difficult subject than that of stability
is presented by the singularities of the
motions such as arise in the problem of three
bodies at collision. The work of Levi-Civita
and Sundman especially has shown that the
singularities can frequently be eliminated by
means of appropriate changes of variables.
In consequence the coordinates of dynamical
systems admit of simple analytic representation
for all values of the time. In particular
Sundman has proved that the coordinates and
the time in the problem of three bodies can
be expressed in terms of permanently convergent
power series, and thus he has “solved”
the problem of three bodies in the highly artificial
sense proposed by Painlevé in 1897.
Unfortunately these series are valueless either
as a means of obtaining qualitative information
or as a basis for numerical computation,
and thus are not of particular importance.

From early times the mind of man has
persistently endeavored to characterize the
properties of the motions of the stars by
means of periodicities. It seems doubtful
whether any other mode of satisfactory description
is possible. The intuitive basis for
this is easily stated: any motion of a dynamical
system must tend with lapse of time
towards a characteristic cyclic mode of behavior.

Thus, in characterizing the motions of a
dynamical system, those of periodic type are
of central importance and simplicity. Much
recent work has dealt with the existence of
periodic motions, mainly for dynamical systems
with two degrees of freedom.





An early method of attack was that of
analytical continuation, due to Hill and Poincaré.
A periodic motion maintains its identity
under continuous variation of a parameter
in the dynamical problem, and may be followed
through the resultant changes. G.
Darwin, F. R. Moulton and others have applied
this method to the restricted problem
of three bodies. Symmetrical motions can be
treated frequently by particularly simple
methods. Hill made use of this fact in his
work.

Another method is based on the geodesic interpretation
of dynamical problems. This has
been developed by Hadamard, Poincaré, Whittaker,
myself, and others. The closed geodesics
correspond to the periodic motions, and the
fact that certain closed geodesics of minimum
length must exist forms the basis of the argument
in many cases. As an example of another
type, take any surface with the connectivity
of a sphere and imagine to lie in
it a string of the minimum length which can
be slipped over the surface. Clearly in being
slipped over the surface there will be an
intermediate position in which the string will
be taut and will coincide with a closed
geodesic.

Finally there is a less immediate method of
attack which Poincaré introduced in 1912,
and which I have tried to extend. By it the
existence of periodic motions is made to depend
on the existence of invariant points of
certain continua under one-to-one continuous
transformation. The successful application
of this method involves a preliminary knowledge
of certain of the simpler periodic
motions.

Periodic motions fall into two classes which
we may call hyperbolic and elliptic. In the
hyperbolic case analytic families of nearby
motions asymptotic to the given periodic
motion in either sense exist, while all other
nearby motions approach and then recede
from it with the passing of time. In the
elliptic case the motion is formally stable,
but the phenomenon of asymptotic families
not of analytic type arises unless the motion
is stable in the sense of Levi-Civita.

In a very deep sense the periodic motions
bear the same kind of relation to the totality
of motions that repeating doubly infinite
sequences of integers 1 to 9 such as

. . . 2323 . . .

do to the totality of such sequences.

In trying to deal with the totality of
possible types of motion it seems desirable to
generalize the concept of periodic motion to
recurrent motion as follows: any motion is
recurrent if, during any interval of time in
the past or future of sufficiently long duration
T, it comes arbitrarily near to all of its
states of motion. With this definition I have
proved that every motion is either recurrent
or approaches with uniform frequency arbitrarily
near a set of recurrent motions.

The recurrent motions correspond to those
double sequences specified above in which every
finite sequence which is present at all occurs
at least once in every set of N successive integers
of the sequence.
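
The window property just described can be tested on finite stretches of a sequence. The sketch below is an illustration added under stated assumptions: it uses the Thue-Morse sequence (written with the symbols 1 and 2) as a standard example of a non-periodic sequence with this property, and it examines only a finite prefix, so it suggests rather than proves the property. All function names are hypothetical.

```python
def thue_morse(n):
    """First n terms of the Thue-Morse sequence, written with the symbols 1 and 2."""
    return [1 + bin(i).count("1") % 2 for i in range(n)]


def blocks(seq, length):
    """Set of all blocks of the given length that occur in seq."""
    return {tuple(seq[i:i + length]) for i in range(len(seq) - length + 1)}


def window_contains_all(seq, length, window):
    """True if every run of `window` successive terms contains every block of
    `length` terms that occurs anywhere in seq."""
    wanted = blocks(seq, length)
    return all(
        blocks(seq[start:start + window], length) == wanted
        for start in range(len(seq) - window + 1)
    )


def smallest_window(seq, length):
    """Smallest N such that every N successive terms of seq contain every block
    of the given length (computed for the finite stretch seq only)."""
    for window in range(length, len(seq) + 1):
        if window_contains_all(seq, length, window):
            return window
    return None


if __name__ == "__main__":
    periodic = [2, 3] * 50
    print("periodic ...2323...:", smallest_window(periodic, 2))
    print("Thue-Morse prefix  :", smallest_window(thue_morse(256), 2))
```

For the periodic sequence the window is as small as it could be; for the non-periodic example a finite window still suffices, which is the feature that distinguishes a recurrent sequence from an arbitrary one.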

In any domain of transitivity the two extreme
types of motion are the recurrent
motions on the one hand and the motions
which pass arbitrarily near every state of
motion in the domain on the other. Both
types necessarily exist, as well as other intermediate
types.

The precise nature of such recurrent motions
has yet to be determined, but Dr. H. C.
M. Morse in his 1918 dissertation at Harvard
has shown that there exist non-periodic
recurrent motions of entirely new type in
simple dynamical problems.

Such are a few of the steps in advance that 
theoretical dynamics has taken in recent 
years. I wish in conclusion to illustrate by 
a very simple example the type of powerful
and general geometric method of attack first
used by Poincaré.

Consider a particle P of given mass in
rectilinear motion through a medium and
in a field of force such that the force acting
upon P is a function of its displacement
and velocity. In order to achieve simplicity
I will assume further that the law of
force is of such a nature that, whatever be the
initial conditions, the particle P will pass
through a fixed point O infinitely often.

If P passes O with velocity v it passes O
at a first later time with a velocity $v_1$ of
opposite sign. We have then a continuous
one-to-one functional relation $v_1 = f(v)$. If
v is taken as a one-dimensional coordinate in
a line, then the effect of the transformation
$v_1 = f(v)$ is a species of qualitative “reflection”
of the line about the point O.

If this “reflection” is repeated the resultant
operation gives the velocity of P at the
second passage of O, and so on. But the
most elementary considerations show that
either (1) the reflection thus repeated brings
each point to its initial position, or (2) the
line is broken up into an infinite set of pairs
of intervals, one on each side of O, which are
reflected into themselves, or (3) there is a
finite set of such pairs of intervals, or (4) every
point tends toward O (or away from it) under
the double reflection.

Hence there are four corresponding types of
systems that may arise. Either (1) every
motion is periodic and O is a position of
equilibrium, or (2) there is an infinite
discrete set of periodic motions of increasing
velocity and amplitude (counting the
equilibrium position at O as the first) such
that, in any other motion, P tends toward
one of those periodic motions as time increases
and toward an adjacent periodic motion
in past time, or (3) there is a finite
set of periodic motions of similar type such
that, in any other motion, P behaves as just
stated, if there be added a last periodic motion
with “infinite velocity and amplitude”
as a matter of convention, or (4) in every motion
P oscillates with diminishing velocity and
amplitude about O as time changes in one
sense and with ever increasing velocity and
amplitude as time changes in the opposite
sense.

Here we have used the obvious fact that 
there is a one-to-one correspondence between 
velocity at O and maximum amplitude in the 
immediately following quarter swing. 
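
The classification can be explored by iterating the double reflection numerically. The sketch below is an added illustration, not part of the argument: the three laws `f_center`, `f_cycles` and `f_damped` are hypothetical choices of a continuous, one-to-one, sign-reversing relation $v_1 = f(v)$, picked to exhibit types (1), (2) and (4) respectively. For `f_cycles` the double reflection leaves fixed exactly the velocities that are integer multiples of pi.

```python
import math


def double_reflection(f):
    """Return g = f o f, the map giving the velocity at the second later passage of O."""
    return lambda v: f(f(v))


def iterate(g, v, n):
    """Apply g to v successively n times."""
    for _ in range(n):
        v = g(v)
    return v


# Hypothetical laws v_1 = f(v); each is continuous, one-to-one and reverses sign.
def f_center(v):
    """Type (1): the double reflection is the identity, every motion is periodic."""
    return -v


def f_cycles(v):
    """Type (2): the double reflection fixes v = 0, pi, 2 pi, ... only."""
    return -(v + 0.5 * math.sin(v))


def f_damped(v):
    """Type (4): every velocity tends toward 0 under the double reflection."""
    return -0.5 * v


if __name__ == "__main__":
    for name, f in [("center", f_center), ("cycles", f_cycles), ("damped", f_damped)]:
        g = double_reflection(f)
        print(name, [round(iterate(g, 1.0, n), 6) for n in (0, 1, 5, 50)])
```

Starting from v = 1, the second law carries the velocity toward pi, the nearest attracting fixed velocity, while the inverse map would carry it back toward the equilibrium value 0; this is the behavior described under type (2).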

This example illustrates the central role of 
periodic motions in dynamical problems. It is 
also easy to see in this particular example that 
the totality of motions has been completely 
characterized by these qualitative properties 
in a certain sense which we shall not attempt 
to elaborate. 

What is the place of the developments reviewed
above in theoretical dynamics?

The recent advances supplement in an important
way the more physical, formal, and
computational aspects of the science by providing
a rigorous and qualitative background.

To deny a position of great importance to
these results, because of a lack of emphasis
upon the older aspects of the science, would be
as illogical as to deny the importance of the
concept of the continuous number system
merely because of the fact that in computation
attention is confined to rational numbers.

George D. Birkhoff





SURFACE TRANSFORMATIONS AND THEIR DYNAMICAL 

APPLICATIONS. 


BY

GEORGE D. BIRKHOFF

of Cambridge, Mass., U. S. A.


A state of motion in a dynamical system with two degrees of freedom 
depends on two space and two velocity coordinates, and thus may be represented 
by means of a point in space of four dimensions. When only those motions are 
considered which correspond to a given value of the energy constant, the points 
lie in a certain three-dimensional manifold. The motions are given as curves in 
this manifold. One such curve passes through each point. 

Imagine these curves to be cut by a surface lying in the manifold. As the
time increases, a moving point of the manifold describes a half-curve and meets
the surface in successive points, P, P', .... In this manner a particular transformation
of the surface into itself, namely that which takes any point P into
the unique corresponding point P', is set up.

This fundamental reduction of the dynamical problem to a transformation
problem was first effected by Poincaré and later, more generally, by myself.¹
In order to take further advantage of it I consider such transformations at length
in the following paper, which appears here by the kind invitation of Professors
Mittag-Leffler and Nörlund. The dynamical applications are made briefly
in conclusion. These bear on the difficult questions of integrability, stability,
and the classification and interrelation of the various types of motions.

¹ Dynamical systems with two degrees of freedom. Transactions of the American Mathematical
Society, vol. 18, 1917.

Chapter I. Formal Theory of Invariant Points.

§ 1. Hypotheses.

For the present we shall confine attention to the consideration of a one-to-one,
direct, analytic transformation T in the vicinity of an invariant point of
the surface S undergoing transformation. Hence, if u, v be properly taken
coordinates with the invariant point at $u = v = 0$, the transformation may be
written

\[ u_1 = au + bv + \cdots, \qquad v_1 = cu + dv + \cdots, \tag{1} \]

where the right-hand members are real power series in u, v (i. e. with real
coefficients), where $u_1, v_1$ are the coordinates of the transformed point, and where

\[ ad - bc > 0. \tag{2} \]

More generally, the notation $(u_k, v_k)$ or $P_k$ ($k = 0, \pm 1, \pm 2, \ldots$) will stand
for the point obtained by applying the kth iterate (power) of T to (u, v) or P.
Furthermore it will be assumed that there exists a real analytic function
$Q(u, v)$, not zero for $u = v = 0$, such that the double integral

\[ \iint Q(u, v)\, du\, dv \]

has the same value when extended over any region as over its image under T.
Following a dynamical analogy such a transformation will be called conservative.
Also Q will be termed a quasi-invariant function of T.

An explicit form for the condition that a quasi-invariant function must
satisfy is well-known¹ and may be readily derived. If the double integral be
expressed in terms of the new variables $u_1, v_1$, it takes the form

\[ \iint Q(u, v)\, \frac{\partial(u, v)}{\partial(u_1, v_1)}\, du_1\, dv_1, \]

where the integration extends over the image of the given region under T.
Since the given region is arbitrary, and since by hypothesis the last written
integral has the same value as $\iint Q(u_1, v_1)\, du_1\, dv_1$ taken over the same region,
we infer that the two integrands are equal. But the Jacobian of u, v as to
$u_1, v_1$ is the reciprocal of the Jacobian of $u_1, v_1$ as to u, v. Hence we obtain

\[ Q(u, v) = Q(u_1, v_1) \left[ \frac{\partial u_1}{\partial u} \frac{\partial v_1}{\partial v} - \frac{\partial u_1}{\partial v} \frac{\partial v_1}{\partial u} \right]. \tag{3} \]

¹ Cf. E. Goursat, Sur les transformations ponctuelles qui conservent les volumes. Bulletin des
Sciences Mathématiques, vol. 52, 1917.

Conversely, if $Q(u, v)$ is a real analytic function, not zero for $u = v = 0$,
and if (3) is true, it follows at once that Q is a quasi-invariant function.
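
Condition (3), namely that Q(u, v) equals Q(u_1, v_1) multiplied by the Jacobian of u_1, v_1 as to u, v, can be checked numerically at sample points. The sketch below is an added illustration under stated assumptions: the map `T` (a rotation composed with the shear v -> v - u^2) and the angle `THETA` are hypothetical examples, and the Jacobian is only approximated by central differences.

```python
import math

THETA = 0.7  # hypothetical rotation angle


def T(u, v):
    """Hypothetical area-preserving map: shear v -> v - u^2 followed by a rotation."""
    s = v - u * u
    return (u * math.cos(THETA) - s * math.sin(THETA),
            u * math.sin(THETA) + s * math.cos(THETA))


def stretch(u, v):
    """A map that doubles areas, for contrast."""
    return (2.0 * u, v)


def jacobian_det(F, u, v, h=1e-6):
    """Jacobian of (u1, v1) = F(u, v) as to (u, v), by central differences."""
    up, vp = F(u + h, v)
    um, vm = F(u - h, v)
    du1_du, dv1_du = (up - um) / (2 * h), (vp - vm) / (2 * h)
    up, vp = F(u, v + h)
    um, vm = F(u, v - h)
    du1_dv, dv1_dv = (up - um) / (2 * h), (vp - vm) / (2 * h)
    return du1_du * dv1_dv - du1_dv * dv1_du


def defect(Q, F, u, v):
    """Left member of (3) minus right member; zero when Q is quasi-invariant."""
    u1, v1 = F(u, v)
    return Q(u, v) - Q(u1, v1) * jacobian_det(F, u, v)


def one(u, v):
    """The constant function Q = 1."""
    return 1.0


if __name__ == "__main__":
    print(defect(one, T, 0.3, -0.2))        # close to 0: Q = 1 is quasi-invariant
    print(defect(one, stretch, 0.3, -0.2))  # close to -1: Q = 1 is not
```

With Q = 1 the condition reduces to the statement that the Jacobian is unity, i. e. that areas are preserved.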

If there exists a second quasi-invariant function Q' not a constant multiple
of Q, it is clear that the ratio $Q'/Q$ is an analytic invariant function of T, not
zero for $u = v = 0$. Moreover, if any quasi-invariant function be multiplied by
such an invariant function, the product is clearly a quasi-invariant function.

When a conservative transformation T has an analytic invariant function
(not a constant), the transformation will be said to be integrable.¹

A transformation T remains conservative under a change of variables, say
from u, v to $\bar u, \bar v$. The quasi-invariant function Q is thereby modified to a
function $\bar Q$ obtained by multiplying Q by the Jacobian of u, v as to $\bar u, \bar v$.

§ 2. Preliminary Classification of Invariant Points. 

We first make an evident and well-known preliminary classification of invariant
points which is wholly based on the nature of the linear terms in the
power series for $u_1, v_1$. Under real linear change of variables these first degree
terms are transformed among themselves without reference to terms of higher
degree. Consequently the theory of linear transformations applies to these terms.
According to this theory the classification depends largely upon the nature of
the roots of the quadratic equation in $\rho$,

\[ \rho^2 - (a + d)\rho + ad - bc = 0. \]

In the case at hand this equation is a reciprocal quadratic equation, i. e.

\[ ad - bc = 1. \tag{4} \]

For, if $u = v = 0$, we have $Q(u, v) = Q(u_1, v_1) \neq 0$ and also

\[ \frac{\partial u_1}{\partial u} = a, \qquad \frac{\partial u_1}{\partial v} = b, \qquad \frac{\partial v_1}{\partial u} = c, \qquad \frac{\partial v_1}{\partial v} = d. \]

Thus from (3) the stated equation (4) follows. The roots of this reciprocal
equation will be designated as $\rho$ and $1/\rho$.

¹ It should be observed that the definition refers to the vicinity of an invariant point.

There are the following three cases to consider. First, $\rho$ may be real with
a numerical value not unity; T can then be taken in the normal form

\[ \text{I.} \qquad u_1 = \rho u + \sum_{m+n=2}^{\infty} \varphi_{mn} u^m v^n, \qquad v_1 = \frac{1}{\rho} v + \sum_{m+n=2}^{\infty} \psi_{mn} u^m v^n. \]

We subdivide this case according as $\rho$ is positive (case I') or negative (case I'').
Secondly, $\rho$ may be complex and so of modulus 1. With this case we group
that case $\rho = \pm 1$ in which the two elementary divisors are distinct. Here T
may be taken in the normal form

\[ \text{II.} \qquad u_1 = u \cos\theta - v \sin\theta + \sum_{m+n=2}^{\infty} \varphi_{mn} u^m v^n, \qquad v_1 = u \sin\theta + v \cos\theta + \sum_{m+n=2}^{\infty} \psi_{mn} u^m v^n. \]

It is convenient to subdivide case II into the irrational case II' when $\theta/2\pi$ is
irrational, and the rational cases II'' when $\theta = 0$, and II''' when $\theta/2\pi = p/q$ with
$p/q$ not an integer. Case II'' yields the case $\rho = 1$; and II''', the case $\rho = -1$.
Thirdly, we have that case in which the two elementary divisors are not distinct;
here T may be taken in the normal form

\[ \text{III.} \qquad u_1 = \pm u + \sum_{m+n=2}^{\infty} \varphi_{mn} u^m v^n \quad (\rho = \pm 1), \qquad v_1 = \pm v + du + \sum_{m+n=2}^{\infty} \psi_{mn} u^m v^n \quad (d \neq 0). \]

We subdivide this case according as $\rho = 1$ (case III') or $\rho = -1$ (case III'').

If only linear terms are present in $u_1, v_1$, we obtain the linear transformations:

\[ \text{I.} \qquad u_1 = \rho u, \quad v_1 = \frac{1}{\rho} v \quad (\rho \neq \pm 1), \]
\[ \text{II.} \qquad u_1 = u \cos\theta - v \sin\theta, \quad v_1 = u \sin\theta + v \cos\theta, \]
\[ \text{III.} \qquad u_1 = \pm u, \quad v_1 = \pm v + du \quad (d \neq 0). \]


These may be regarded as furnishing a first approximation to the corresponding
general types. According to our definition all three linear transformations are
conservative with $Q = 1$ a quasi-invariant function since areas are left invariant.
Furthermore these cases are integrable with invariant functions $uv$, $u^2 + v^2$, $u^2$
respectively.

In the first case a point P will move on a hyperbola $uv = \text{const.}$ upon
successive application of T or $T^{-1}$ (u, v being taken as rectangular coordinates);
in the third case P will move along a pair of parallel lines $u^2 = \text{const.}$ Unless
the point P lies on the degenerate hyperbola $uv = 0$ in the first case, or on the
pair of coincident straight lines $u^2 = 0$ in the third, P will recede to infinity
upon successive application of T or $T^{-1}$. When P lies on the degenerate hyperbola
in the first case, it will approach the invariant point (0, 0) upon successive
application of T or else of $T^{-1}$, and recede to infinity upon application of the
inverse transformation. In the third case all points of the line $u = 0$ are invariant
or are reflected into points of the same line on the other side of (0, 0), according
as the + or − sign is used.

On the other hand, in the second case the transformation is a rotation
about (0, 0) through an angle $\theta$, and every point P remains at a fixed distance
from (0, 0) upon successive application of T or $T^{-1}$.

The essence of the distinction here existing is brought out clearly by means
of the following fundamental definition: if a neighborhood of an invariant point
can be so taken that points arbitrarily near the invariant point leave this neighborhood
upon successive application of T (or of $T^{-1}$), the invariant point is
unstable; in the contrary case the invariant point is stable.¹

Thus the linear transformations I, III are unstable in this sense, while
those of type II are stable.
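
The behavior of the three linear types is easily followed numerically. The sketch below is an added illustration; the function names and the particular values of the multiplier, the angle and the constant d are hypothetical. It iterates each linear transformation and records the invariant functions uv, u^2 + v^2 and u^2 along the resulting sequence of points.

```python
import math


def hyperbolic(rho):
    """Linear type I: u1 = rho u, v1 = v / rho."""
    return lambda u, v: (rho * u, v / rho)


def rotation(theta):
    """Linear type II: rotation about (0, 0) through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return lambda u, v: (u * c - v * s, u * s + v * c)


def parabolic(d, sign=1):
    """Linear type III: u1 = +-u, v1 = +-v + d u."""
    return lambda u, v: (sign * u, sign * v + d * u)


def orbit(T, point, n):
    """The points P, P_1, ..., P_n obtained by successive application of T."""
    points = [point]
    for _ in range(n):
        point = T(*point)
        points.append(point)
    return points


def spread(values):
    """Difference between the largest and the smallest value."""
    return max(values) - min(values)


if __name__ == "__main__":
    start = (0.1, 0.2)
    for name, T, invariant in [
        ("I", hyperbolic(2.0), lambda u, v: u * v),
        ("II", rotation(1.0), lambda u, v: u * u + v * v),
        ("III", parabolic(0.5), lambda u, v: u * u),
    ]:
        pts = orbit(T, start, 20)
        print(name, "invariant spread:", spread([invariant(*p) for p in pts]),
              "final distance:", math.hypot(*pts[-1]))
```

In types I and III the distance from (0, 0) grows without bound for this starting point, while in type II it does not change, in agreement with the distinction between unstable and stable invariant points.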


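¹ Cf. T. Levi-Civita, Sopra alcuni criteri di instabilità. Annali di Matematica, ser. III.

§ 3. An auxiliary Lemma.

Before proceeding to the consideration of formal series for $u_k, v_k$ ($k = 0,
\pm 1, \pm 2, \ldots$), we will establish the following obvious but useful lemma:

Lemma. The linear difference equation of the first order in y(k),

\[ y(k + 1) - \sigma y(k) = c \lambda^k k^{\mu} \]

($\sigma$, c, $\lambda$ real, and $\mu$ a positive integer or zero) admits a solution

\[ \lambda^k \, (\text{real polynomial in } k \text{ of degree } \mu) \]

if $\lambda \neq \sigma$, and otherwise a solution

\[ \lambda^k \, (\text{real polynomial in } k \text{ of degree } \mu + 1). \]

Suppose first that $\lambda \neq \sigma$. Let us make the substitution $y = \lambda^k w$, when the
difference equation takes the form

\[ w(k + 1) - \frac{\sigma}{\lambda} w(k) = \frac{c}{\lambda} k^{\mu}. \]

If we write

\[ w = w^{(0)} k^{\mu} + w^{(1)} k^{\mu - 1} + \cdots + w^{(\mu)}, \]

we find that w will be a solution if the following conditions are satisfied:

\[ \Bigl(1 - \frac{\sigma}{\lambda}\Bigr) w^{(0)} = \frac{c}{\lambda}, \qquad \mu w^{(0)} + \Bigl(1 - \frac{\sigma}{\lambda}\Bigr) w^{(1)} = 0, \qquad \ldots, \qquad w^{(0)} + w^{(1)} + \cdots + w^{(\mu - 1)} + \Bigl(1 - \frac{\sigma}{\lambda}\Bigr) w^{(\mu)} = 0. \]

On account of the assumption made, we see at once that these equations determine
real quantities $w^{(0)}, w^{(1)}, \ldots, w^{(\mu)}$ in succession, and lead to a solution of the kind
specified.

If $\lambda = \sigma$ a slightly modified argument applies. Here we write $y = \lambda^k w$ as
before, and then

\[ w = w^{(0)} k^{\mu + 1} + w^{(1)} k^{\mu} + \cdots + w^{(\mu + 1)}. \]

The conditions on the coefficients take the form

\[ (\mu + 1) w^{(0)} = \frac{c}{\lambda}, \qquad \frac{(\mu + 1)\mu}{1 \cdot 2} w^{(0)} + \mu w^{(1)} = 0, \qquad \ldots, \qquad w^{(0)} + w^{(1)} + \cdots + w^{(\mu)} = 0. \]

These equations determine real quantities $w^{(0)}, w^{(1)}, \ldots, w^{(\mu)}$ in succession but
leave $w^{(\mu + 1)}$ undetermined, although it is to be taken real.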



§ 4. Formal Series for $u_k, v_k$. Case I.

By iteration one can obtain convergent series for $u_k, v_k$ in terms of u, v.
In case I the linear terms of these series are evidently $\rho^k u$, $\rho^{-k} v$ respectively.
This fact suggests that higher degree terms may be similarly given an explicit
form in k, and we shall show this to be the fact.

If $u_1, v_1$ are real series of the form I with $\rho > 0$ (case I'), $u_k, v_k$ may be
represented for all integral values of k in the form

\[ \text{I}'_k. \qquad u_k = \rho^k u + \sum_{m+n=2}^{\infty} \varphi^{(k)}_{mn} u^m v^n, \qquad v_k = \rho^{-k} v + \sum_{m+n=2}^{\infty} \psi^{(k)}_{mn} u^m v^n, \]

where $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ are real polynomials in $\rho^k, \rho^{-k}, k$ of degree at most m + n in these
variables.

Let us consider first the quadratic terms in the series for $u_k, v_k$.

If in $u_k, v_k$ we replace u, v by $u_1, v_1$ respectively, we obtain $u_{k+1}, v_{k+1}$ by
definition. By comparison of coefficients in $\text{I}'_k$ above, this leads to the equations

\[
\begin{aligned}
\varphi^{(k+1)}_{20} &= \rho^k \varphi_{20} + \rho^{2} \varphi^{(k)}_{20}, &
\varphi^{(k+1)}_{11} &= \rho^k \varphi_{11} + \varphi^{(k)}_{11}, &
\varphi^{(k+1)}_{02} &= \rho^k \varphi_{02} + \rho^{-2} \varphi^{(k)}_{02}, \\
\psi^{(k+1)}_{20} &= \rho^{-k} \psi_{20} + \rho^{2} \psi^{(k)}_{20}, &
\psi^{(k+1)}_{11} &= \rho^{-k} \psi_{11} + \psi^{(k)}_{11}, &
\psi^{(k+1)}_{02} &= \rho^{-k} \psi_{02} + \rho^{-2} \psi^{(k)}_{02}.
\end{aligned}
\]

The first three of these equations are obtained by comparing the coefficients of
$u^2, uv, v^2$ respectively in $u_{k+1}(u, v)$ and $u_k(u_1, v_1)$; the second three are found by
a like comparison of $v_{k+1}(u, v)$ and $v_k(u_1, v_1)$.

By considering $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ with m + n = 2 as undetermined functions of the
index k, it is clear that these six equations constitute six difference equations
of the type treated in the lemma of § 3.

Moreover these equations suffice to determine these six functions fully for
all integral values of k if their value is known for any particular k. In the case
at hand we have of course $\varphi^{(0)}_{mn} = \psi^{(0)}_{mn} = 0$ for all m and n, since $u_0 = u$, $v_0 = v$.

According to the lemma we can find explicit solutions of these difference
equations of a very simple type, namely constant multiples of $\rho^k$ for the first
three equations, and of $\rho^{-k}$ for the second three equations. Also the six reduced
homogeneous equations obtained by removing the first term on the right in the
six equations admit the following respective particular solutions:

\[ \rho^{2k}, \quad 1, \quad \rho^{-2k}; \qquad \rho^{2k}, \quad 1, \quad \rho^{-2k}. \]

By adding real constant multiples of these solutions to the respective
solutions of the non-homogeneous equations, we find a new set of particular
solutions vanishing for k = 0 as desired.

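In this way we obtain the explicit values of $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ for m + n = 2:

\[
\begin{aligned}
\varphi^{(k)}_{20} &= \varphi_{20} \frac{\rho^{2k} - \rho^{k}}{\rho^{2} - \rho}, &
\varphi^{(k)}_{11} &= \varphi_{11} \frac{\rho^{k} - 1}{\rho - 1}, &
\varphi^{(k)}_{02} &= \varphi_{02} \frac{\rho^{k} - \rho^{-2k}}{\rho - \rho^{-2}}, \\
\psi^{(k)}_{20} &= \psi_{20} \frac{\rho^{2k} - \rho^{-k}}{\rho^{2} - \rho^{-1}}, &
\psi^{(k)}_{11} &= \psi_{11} \frac{\rho^{-k} - 1}{\rho^{-1} - 1}, &
\psi^{(k)}_{02} &= \psi_{02} \frac{\rho^{-2k} - \rho^{-k}}{\rho^{-2} - \rho^{-1}}.
\end{aligned}
\tag{5}
\]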
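
Explicit quadratic coefficients of this kind can be checked by iterating a sample transformation directly. The sketch below is an added illustration under stated assumptions: it takes a transformation of the form I with hypothetical rational values for the multiplier and for the six quadratic coefficients (called `PHI` and `PSI` here), composes the series exactly while discarding all terms above the second degree, and compares the result with the closed forms phi_20 (rho^2k - rho^k)/(rho^2 - rho), phi_11 (rho^k - 1)/(rho - 1), and so on, which are our reading of the formulas for m + n = 2.

```python
from fractions import Fraction

DEG = 2  # terms of total degree above 2 are discarded throughout


def mul(p, q):
    """Product of two truncated series stored as {(m, n): coefficient of u^m v^n}."""
    out = {}
    for (a, b), x in p.items():
        for (c, d), y in q.items():
            if a + b + c + d <= DEG:
                key = (a + c, b + d)
                out[key] = out.get(key, 0) + x * y
    return out


def compose(series, u1, v1):
    """Truncated series obtained by replacing u, v in `series` by u1, v1."""
    out = {}
    for (m, n), coef in series.items():
        term = {(0, 0): coef}
        for _ in range(m):
            term = mul(term, u1)
        for _ in range(n):
            term = mul(term, v1)
        for key, x in term.items():
            out[key] = out.get(key, 0) + x
    return out


def iterate_series(u1, v1, k):
    """Truncated series for u_k, v_k (k >= 1), using u_{k+1}(u, v) = u_k(u_1, v_1)."""
    uk, vk = u1, v1
    for _ in range(k - 1):
        uk, vk = compose(uk, u1, v1), compose(vk, u1, v1)
    return uk, vk


def closed_forms(r, phi, psi, k):
    """Assumed closed forms of the quadratic coefficients of u_k and v_k."""
    phi_k = {(2, 0): phi[(2, 0)] * (r ** (2 * k) - r ** k) / (r ** 2 - r),
             (1, 1): phi[(1, 1)] * (r ** k - 1) / (r - 1),
             (0, 2): phi[(0, 2)] * (r ** k - r ** (-2 * k)) / (r - r ** -2)}
    psi_k = {(2, 0): psi[(2, 0)] * (r ** (2 * k) - r ** (-k)) / (r ** 2 - r ** -1),
             (1, 1): psi[(1, 1)] * (r ** (-k) - 1) / (r ** -1 - 1),
             (0, 2): psi[(0, 2)] * (r ** (-2 * k) - r ** (-k)) / (r ** -2 - r ** -1)}
    return phi_k, psi_k


# Hypothetical data for illustration only.
RHO = Fraction(3, 2)
PHI = {(2, 0): Fraction(1), (1, 1): Fraction(-2), (0, 2): Fraction(1, 3)}
PSI = {(2, 0): Fraction(5), (1, 1): Fraction(1, 7), (0, 2): Fraction(-4)}
U1 = {(1, 0): RHO, **PHI}
V1 = {(0, 1): 1 / RHO, **PSI}


def agrees(k):
    """True if direct iteration and the closed forms give the same coefficients."""
    uk, vk = iterate_series(U1, V1, k)
    phi_k, psi_k = closed_forms(RHO, PHI, PSI, k)
    return all(uk[key] == phi_k[key] and vk[key] == psi_k[key] for key in PHI)


if __name__ == "__main__":
    print([agrees(k) for k in range(1, 7)])
```

Because the arithmetic is rational the comparison is exact; it covers only positive k and only the quadratic terms, so it is a consistency check rather than a proof.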


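We proceed to show that explicit expressions for $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ of the type
stated exist also for m + n = 3, m + n = 4, ... in succession.

To begin with, we write the equations obtained by a comparison of the
coefficients of $u^m v^n$ in $u_{k+1}(u, v)$, $u_k(u_1, v_1)$ and $v_{k+1}(u, v)$, $v_k(u_1, v_1)$ in the
respective abbreviated forms:

\[
\begin{aligned}
\varphi^{(k+1)}_{mn} &= \rho^{k} \varphi_{mn} + \rho^{m-n} \varphi^{(k)}_{mn} + P_{mn}, \\
\psi^{(k+1)}_{mn} &= \rho^{-k} \psi_{mn} + \rho^{m-n} \psi^{(k)}_{mn} + Q_{mn}.
\end{aligned}
\]

The expansions of $\rho^k u_1$ and $\rho^{-k} v_1$ in $u_k(u_1, v_1)$ and $v_k(u_1, v_1)$ respectively yield
the first terms on the right in these equations. The second terms arise from the
expansion of $\varphi^{(k)}_{mn} u_1^m v_1^n$ and $\psi^{(k)}_{mn} u_1^m v_1^n$ in the same functions. The last terms arise
from the expansion of $\varphi^{(k)}_{\alpha\beta} u_1^{\alpha} v_1^{\beta}$ and $\psi^{(k)}_{\alpha\beta} u_1^{\alpha} v_1^{\beta}$ respectively, with $\alpha + \beta < m + n$;
thus $P_{mn}$ and $Q_{mn}$ are linear and homogeneous in $\varphi^{(k)}_{\alpha\beta}$, $\psi^{(k)}_{\alpha\beta}$ respectively, with
real coefficients, polynomial in $\rho, \rho^{-1}, \varphi_{\alpha\beta}, \psi_{\alpha\beta}$ ($\alpha + \beta < m + n$).

Suppose now that we take m + n = 3 and assume that the explicit expressions
for $\varphi^{(k)}_{\alpha\beta}, \psi^{(k)}_{\alpha\beta}$ ($\alpha + \beta = 2$) are substituted in $P_{mn}$, $Q_{mn}$. The above equations become
linear difference equations in $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$. Furthermore, it is clear that these
equations, together with the fact that $\varphi^{(0)}_{mn}, \psi^{(0)}_{mn}$ vanish, determine these variables
completely for all integral values of k.

By a similar process to that employed in the case m + n = 2 we may arrive
now at explicit expressions for $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ in the case m + n = 3.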


In this new case we have a non-homogeneous part composed of more than
one term. But each term is of the form $c \lambda^k k^{\mu}$ occurring on the right-hand side
of the equation of the lemma (§ 3), since the non-homogeneous part is a polynomial
in $\rho^k, \rho^{-k}$ of degree at most 2.

If we add together the various particular solutions corresponding to each
of these terms, as given by the lemma, we obtain a solution of each difference
equation for m + n = 3 in the form of a real polynomial in $\rho^k, \rho^{-k}, k$, of at most
the third degree in these variables.

The corresponding homogeneous reduced equation has a solution $\rho^{(m-n)k}$.
If a suitable real constant multiple of this solution is added to the above particular
solution of the non-homogeneous equation, a new particular solution is
obtained which vanishes for k = 0. Solutions of this type are real polynomials
in $\rho^k, \rho^{-k}, k$ of degree at most 3 in these variables, and form the desired
expressions.

Proceeding indefinitely in this way we establish the truth of the italicized
statement for m + n = 3, m + n = 4, ....

It is obvious that the coefficients in the polynomials $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ are themselves
real polynomials in the coefficients of the series $u_1, v_1$, save for divisors
of the form $\rho^{\alpha} - \rho^{\beta}$ where $\alpha$ and $\beta$ are unequal integers.

In the later discussion it is convenient to bring back the case I'' ($\rho < 0$) to
the case I' by means of the following remark:

If $u_1, v_1$ are real series of the form I with $\rho < 0$ (case I''), then $u_2, v_2$ are of
the form I' treated above.


§ 5. Formal series for $u_k, v_k$. Case II.

Next let us consider series of type II in the general case when $\theta$ is incommensurable
with $2\pi$.

If $u_1, v_1$ are real series of the form II with $\theta/2\pi$ irrational (case II'), $u_k, v_k$ may
be represented for all integral values of k in the form

\[ \text{II}'_k. \qquad u_k = u \cos k\theta - v \sin k\theta + \sum_{m+n=2}^{\infty} \varphi^{(k)}_{mn} u^m v^n, \qquad v_k = u \sin k\theta + v \cos k\theta + \sum_{m+n=2}^{\infty} \psi^{(k)}_{mn} u^m v^n, \]

where $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ are real polynomials in $\cos k\theta, \sin k\theta, k$ of degree at most m + n in
these variables.



Let us introduce new variables $\bar u, \bar v$, namely

\[ \bar u = u + \sqrt{-1}\, v, \qquad \bar v = u - \sqrt{-1}\, v. \]

The equations II give series for $\bar u_1, \bar v_1$ in terms of $\bar u, \bar v$, which are of the form I
with $\rho = e^{\sqrt{-1}\,\theta}$.

Now the lemma of § 3 can evidently be extended to the case when $\sigma$, c, $\lambda$
are complex constants. Here of course the polynomial factors in the solutions
are no longer real in general. Hence the same formal treatment of $\bar u_k, \bar v_k$ is
possible as was made in case I' for $u_k, v_k$; in fact for the case at hand none of
the divisors $\rho^{\alpha} - \rho^{\beta}$ are 0 so that the solutions are precisely of the same form.
Thus $\bar u_k, \bar v_k$ can be expressed as power series in $\bar u, \bar v$ with coefficients
$\bar\varphi^{(k)}_{mn}, \bar\psi^{(k)}_{mn}$ of $\bar u^m \bar v^n$ respectively, polynomial in $\rho^k, \rho^{-k}, k$ of degree not more than m + n.

Recalling the simple relation between $\bar u, \bar v$ and u, v, and utilizing the
trigonometric form of $\rho^k$, we arrive at series $u_k, v_k$ of the desired type, save
that the reality of the polynomials $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ is not established.

Although an inspection of the actual formulas employed would establish this
reality, it suffices to note that, since $u_k, v_k$ are real power series, the real parts
of $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ constitute real polynomials of the type required.

In the rational case II'', $\theta = 0$, series of type II are also of type I with $\rho = 1$.
Consequently the method of § 4 leads at once to the conclusion:

If $u_1, v_1$ are real series of the form II with $\theta = 0$ (case II''), $u_k, v_k$ may be
represented for all integral values of k in the form

\[ \text{II}''_k. \qquad u_k = u + \sum_{m+n=2}^{\infty} \varphi^{(k)}_{mn} u^m v^n, \qquad v_k = v + \sum_{m+n=2}^{\infty} \psi^{(k)}_{mn} u^m v^n, \]

where $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ are real polynomials in k of degree at most m + n − 1.¹

The rational case $\theta \neq 0$ can be brought back to the case $\theta = 0$:

If $u_1, v_1$ are real series of the form II with $\theta/2\pi = p/q$ (case II'''), then $u_q, v_q$
are of the form II''.

There are series similar to $\text{II}'_k$ in the general rational case, but we do not
need to use them.

¹ This fact has been noted by C. L. Bouton, Bulletin of the American Mathematical Society,
vol. 23, 1916, p. 73. See also A. A. Bennett, A case of iteration in several variables. Annals of
Mathematics, vol. 17, 1915-1916.


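§ 6. Formal series for $u_k, v_k$. Case III.

Finally we have to consider case III:

If $u_1, v_1$ are real series of the form III with $\rho = 1$ (case III'), $u_k, v_k$ may be
represented for all integral values of k in the form

\[ \text{III}'_k. \qquad u_k = u + \sum_{m+n=2}^{\infty} \varphi^{(k)}_{mn} u^m v^n, \qquad v_k = v + k\,d\,u + \sum_{m+n=2}^{\infty} \psi^{(k)}_{mn} u^m v^n, \]

where $\varphi^{(k)}_{mn}, \psi^{(k)}_{mn}$ are real polynomials in k of degree at most 2m + n − 1.

We propose to deal with this case by reducing it to the case II'' as follows.
Write

\[ u = \bar u \bar v, \qquad v = \bar v, \]

and let us make this change of variables in the given transformation. We obtain

\[ \bar u_1 \bar v_1 = \bar u \bar v + \sum_{m+n=2}^{\infty} \varphi_{mn} \bar u^{m} \bar v^{m+n}, \qquad \bar v_1 = \bar v + d\, \bar u \bar v + \sum_{m+n=2}^{\infty} \psi_{mn} \bar u^{m} \bar v^{m+n}. \]

Now the right-hand member of each of these equations contains $\bar v$ as a factor.
Hence, dividing the first equation, member for member, by the second, we find
the equivalent equations

\[ \bar u_1 = \bar u + \sum_{m+n=2}^{\infty} \bar\varphi_{mn} \bar u^{m} \bar v^{n}, \qquad \bar v_1 = \bar v + \sum_{m+n=2}^{\infty} \bar\psi_{mn} \bar u^{m} \bar v^{n}, \]

which is formally of the type II''. Hence by our result in § 5 we may write
for all integral values of k

\[ \bar u_k = \bar u + \sum_{m+n=2}^{\infty} \bar\varphi^{(k)}_{mn} \bar u^{m} \bar v^{n}, \qquad \bar v_k = \bar v + \sum_{m+n=2}^{\infty} \bar\psi^{(k)}_{mn} \bar u^{m} \bar v^{n}, \]

where $\bar\varphi^{(k)}_{mn}, \bar\psi^{(k)}_{mn}$ are real polynomials in k of degree at most m + n − 1.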

Multiplying these two equations together, member for member, we get

\[ \bar u_k \bar v_k = \bar u \bar v + \sum_{m+n=3}^{\infty} \chi^{(k)}_{mn} \bar u^{m} \bar v^{n}, \]

where $\chi^{(k)}_{mn}$ is a real polynomial in k of degree at most m + n − 2. Compare this
equation with that for $u_k$ as a power series in u, v, and so in $\bar u, \bar v$. The two
series must be identical so that the exponent of $\bar v$ must be at least as great as
that of $\bar u$ in every term. Hence $\chi^{(k)}_{mn}$ vanishes identically for n < m. Consequently,
if we write

\[ \varphi^{(k)}_{mn} = \chi^{(k)}_{m,\, m+n}, \]

we have $u_k$ expressed in the stated form.

Likewise, if we compare the series for $\bar v_k$ with that for $v_k$, we are led to
see that $\bar\psi^{(k)}_{mn}$ vanishes identically for n < m and to write

\[ \psi^{(k)}_{mn} = \bar\psi^{(k)}_{m,\, m+n}, \]

so that $v_k$ is of the stated form.

It may be observed that all of the series employed converge for $u$, $v$ sufficiently small in absolute value. This fact justifies the method of formal comparison employed.

The case III with $\rho = -1$ is taken care of by the following remark:

If $u_1$, $v_1$ are real series of the form III with $\rho = -1$ (case III''), $u_2$, $v_2$ are of the form II''.


§ 7. Uniqueness of series for $u_k$, $v_k$.

The following is easily proved:

Lemma. Unless $\rho$ is a root of unity, a polynomial in $\rho^k$, $\rho^{-k}$, $k$, $\rho^l$, $\rho^{-l}$, $l$, ... cannot vanish for all integral values of $k$, $l$, ... without vanishing identically.




If possible, suppose that the lemma is not true when there is a single variable $k$, i.e. suppose that there exists a polynomial in $\rho^k$, $\rho^{-k}$, $k$ which vanishes for all integral values of $k$ without vanishing identically, although $\rho$ is not a root of unity.

In the first place we cannot have $|\rho| > 1$. For in this case divide the hypothetical polynomial by the highest power of $\rho^k$ which appears explicitly. Let $k$ take on larger and larger integral values. All of the terms of the modified polynomial tend to zero save the term formed by the coefficient of this highest power, inasmuch as $\rho^k$ becomes infinite more rapidly than any power of $k$. This coefficient is itself a polynomial in $k$ which is not identically 0. Hence it cannot approach 0 as $k$ becomes positively infinite. But, since the hypothetical polynomial vanishes for all integral $k$, this is absurd.

The possibility $|\rho| < 1$ is disposed of similarly by dividing through by the highest power of $\rho^{-k}$ which appears.

Hence we have $|\rho| = 1$ and may write $\rho = e^{i\theta}$ where $\theta$ is real. Here we fix upon the coefficient of the highest power of $k$ which appears in the hypothetical polynomial. An argument like that made above shows that this coefficient must approach 0 as $k$ becomes infinite through integral values. However, this coefficient is a polynomial in $\cos k\theta$, $\sin k\theta$; and $\theta/2\pi$ is irrational since $\rho$ is not a root of unity. Hence $k\theta$ can be made to differ from an integral multiple of $2\pi$ by nearly any assigned quantity $t$ for large integral $k$. Thus the coefficient polynomial must vanish when $k\theta$ is replaced by the arbitrary real variable $t$. This is impossible.
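The density step used in the last paragraph can be illustrated numerically. The following is only a sketch and not part of the original argument: the angle `theta = 1.0`, the target value, the tolerance and the function name `find_k` are all assumptions chosen for illustration.

```python
import math

def find_k(theta, target, tol, k_max=10000):
    """Return the first positive integer k such that k*theta differs from
    target by less than tol modulo 2*pi, or None if none is found."""
    two_pi = 2.0 * math.pi
    for k in range(1, k_max + 1):
        diff = (k * theta - target) % two_pi
        dist = min(diff, two_pi - diff)  # distance on the circle
        if dist < tol:
            return k
    return None

# Assumed example: theta = 1 radian, so theta/(2*pi) is irrational.
theta, target, tol = 1.0, 2.5, 0.01
k = find_k(theta, target, tol)
```

A smaller tolerance simply requires a larger search bound `k_max`; the irrationality of $\theta/2\pi$ is what guarantees that some $k$ always exists.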

A similar proof disposes of the case when two or more variables enter. 

An application of the lemma shows at once: 

The polynomials $\varphi_{mn}^{(k)}$, $\psi_{mn}^{(k)}$ of §§ 4, 5, 6 are unique.

In fact it is clear that the difference of two such polynomials with the same subscripts $m$, $n$ vanishes for all integral $k$. But these polynomials are of the type dealt with in the lemma, and must therefore coincide.


§ 8. The formal group for $T$.

The various integral powers of the transformation $T$ combine according to the rule $T_k T_l = T_{k+l}$, where $k$ and $l$ are any integers whatever.

In the preceding sections we have been led to real formal series giving $T_k$ for integral values of $k$ in the cases I', II', II'', III', to which all other cases reduce.



The formulas

(6) $$u_k(u_l, v_l) = u_{k+l}(u, v), \qquad v_k(u_l, v_l) = v_{k+l}(u, v)$$

hold for all real values of $k$ and $l$.

The content of this statement is wholly formal of course.

In the cases II'', III' its truth is at once obvious. The equations (6) stand for an infinite number of ordinary polynomial relations between the coefficients $\varphi_{mn}^{(k)}$, $\psi_{mn}^{(k)}$, $\varphi_{mn}^{(l)}$, $\psi_{mn}^{(l)}$, $\varphi_{mn}^{(k+l)}$, $\psi_{mn}^{(k+l)}$ which are known to hold for all integral values of $k$ and $l$. Since these coefficients are themselves ordinary polynomials in $k$, $l$, these relations hold identically. Similar reasoning, based on the lemma of § 7, shows that the statement is also true in cases I', II'.

From the italicized statement thus established it appears that we have to deal with a one-parameter continuous group of formal transformations and that $k$ is an additive parameter for the group.¹ In treating of its properties we need a few of the general formal ideas for such groups.
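The group law (6) for non-integral parameter values can be checked exactly on a very simple example. This is a sketch under assumptions: the family `T` below, $(u, v) \to (u, v + k u^2)$, is a hypothetical transformation chosen so that its coefficients are polynomials in $k$; it is not taken from the paper.

```python
from fractions import Fraction

def T(k, u, v):
    """Hypothetical one-parameter family T_k: (u, v) -> (u, v + k*u**2)."""
    return u, v + k * u * u

def delta(u, v):
    """Formal derivative of (u_k, v_k) as to k at k = 0 for this family."""
    return Fraction(0), u * u

u, v = Fraction(1, 3), Fraction(-2, 5)
k, l = Fraction(1, 2), Fraction(7, 4)

composed = T(k, *T(l, u, v))   # T_k applied after T_l
direct = T(k + l, u, v)        # T_{k+l}
```

Because the family is linear in $k$, the difference quotient $(v_k - v)/k$ equals the second component of `delta` exactly, which is the role played by $\delta v$ in what follows.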

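We shall write formally

(7) $$\delta u = \left.\frac{\partial u_k}{\partial k}\right|_{k=0}, \qquad \delta v = \left.\frac{\partial v_k}{\partial k}\right|_{k=0},$$

so that we have the following table:

(I'), $\delta u = u \log\rho + \cdots$, $\delta v = -v \log\rho + \cdots$,

(II'), $\delta u = -\theta v + \cdots$, $\delta v = \theta u + \cdots$,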

W (II"). < 5 u-«p I#w » + ..., dt 

(Ill'), du = + dt> = du + •••. 

The series $\delta u$, $\delta v$ are real formal power series in $u$, $v$.

The series $u_k$, $v_k$ satisfy the formal differential equations

(9) $$\frac{du_k}{dk} = \delta u(u_k, v_k), \qquad \frac{dv_k}{dk} = \delta v(u_k, v_k),$$

and the initial conditions $u_0 = u$, $v_0 = v$; conversely $u_k$, $v_k$ are formally determined by these equations and conditions.

¹ C. L. Bouton observed these facts in case II'', loc. cit.




To begin with, by differentiating the first equation (6) formally as to $k$ and noting symmetry, we find

$$\frac{\partial}{\partial k} u_k(u_l, v_l) = \frac{\partial}{\partial k} u_{k+l}(u, v) = \frac{\partial}{\partial l} u_l(u_k, v_k).$$

Putting $l = 0$ and recalling the definition of $\delta u$ we obtain the first of the differential equations (9). The second equation may be deduced in like manner.

The initial conditions $u_0 = u$, $v_0 = v$ are clearly satisfied.

Conversely, if we write $u_k$, $v_k$ as power series in $u$, $v$ without constant term and with coefficients which are undetermined functions of $k$, and substitute in the differential equations, we get at each step linear differential equations of the first order in these coefficients. When joined with the condition that all of these coefficients are 0 for $k = 0$, save the coefficients of $u$ in $u_0$ and of $v$ in $v_0$ which are 1, these equations successively determine the coefficients.

These facts explain the complete analogy between the classification of transformations $T$ near an invariant point and the classification of differential equations of type (9) at a point $\delta u = \delta v = 0$. This analogy was noted by Poincaré.¹

¹ This analogy was explained partially by means of a limiting process by S. Lattès.

§ 9. The invariant operator $L(w)$.

We shall now define the invariant operator $L(w)$:

(10) $$L(w) = \frac{\partial w}{\partial u}\,\delta u + \frac{\partial w}{\partial v}\,\delta v.$$

It is clear that $L(w(u, v))$ is the formal derivative of $w(u_k, v_k)$ as to $k$ for $k = 0$. Hence $L(w)$ is unaltered formally by a change of variables. The fundamental property of this operator is expressed in the following statement:

A necessary and sufficient condition that a formal series $F$ be invariant under $T$ is $L(F) = 0$.

This condition is necessary. In fact, if $F$ is an invariant series we have $F(u_k, v_k) = F(u, v)$ for all integral values of $k$. Hence, by the lemma of § 7, this relation holds for all values of $k$. Differentiating as to $k$ and taking $k = 0$, we find $L(F) = 0$.

Secondly, this condition is sufficient. For if $L(F) = 0$ we find, using (9),

$$\frac{d}{dk} F(u_k, v_k) = \frac{\partial F(u_k, v_k)}{\partial u_k}\frac{du_k}{dk} + \frac{\partial F(u_k, v_k)}{\partial v_k}\frac{dv_k}{dk} = L(F(u_k, v_k)) = 0.$$

Hence we infer that $F(u_k, v_k)$ is a power series with coefficients independent of $k$. Putting $k = 0$ we get $F(u_k, v_k) = F(u, v)$, and in particular $F(u_1, v_1) = F(u, v)$. That is, $F$ is invariant under $T$.
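The criterion $L(F) = 0$ can be seen at work on the linear part of case II'. The sketch below rests on an assumed sign convention, namely that $T_k$ is the rotation $u_k = u\cos k\theta - v\sin k\theta$, $v_k = u\sin k\theta + v\cos k\theta$, so that $\delta u = -\theta v$ and $\delta v = \theta u$; the function names are illustrative only.

```python
import math

def rotate(u, v, theta, k=1.0):
    """Assumed linear model of T_k in case II': rotation through k*theta."""
    c, s = math.cos(k * theta), math.sin(k * theta)
    return u * c - v * s, u * s + v * c

def L_of_F(u, v, theta):
    """L(F) = F_u * delta_u + F_v * delta_v for F = u**2 + v**2,
    with delta_u = -theta*v and delta_v = theta*u."""
    F_u, F_v = 2.0 * u, 2.0 * v
    delta_u, delta_v = -theta * v, theta * u
    return F_u * delta_u + F_v * delta_v

u, v, theta = 0.3, -0.7, 0.9
u1, v1 = rotate(u, v, theta)
invariance_defect = (u1 * u1 + v1 * v1) - (u * u + v * v)
```

Both `L_of_F(u, v, theta)` and `invariance_defect` vanish up to rounding, in agreement with the statement that $F$ is invariant exactly when $L(F) = 0$.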


§ 10. Existence of invariant series.

In §§ 2-9 the fact that $T$ was assumed conservative did not enter, save that we made use of the equation (4). We shall now prove the following:

Any conservative transformation $T$ of the form I', II', II'' or III' leaves invariant a real formal series $F^*$ defined by the equations

(11) $$\frac{\partial F^*}{\partial v} = Q\,\delta u, \qquad \frac{\partial F^*}{\partial u} = -Q\,\delta v.$$

By multiplying together the equation (3) for $u$, $v$, for $u = u_1$, $v = v_1$, ..., for $u = u_{k-1}$, $v = v_{k-1}$, we obtain

(3*) $$Q(u, v) = Q(u_k, v_k)\begin{vmatrix} \dfrac{\partial u_k}{\partial u} & \dfrac{\partial u_k}{\partial v} \\[2mm] \dfrac{\partial v_k}{\partial u} & \dfrac{\partial v_k}{\partial v} \end{vmatrix}$$

for any positive integral value of $k$. We employ the familiar rule for the combination of Jacobians in obtaining this result. Likewise (3*) holds for $k = 0$ and also for negative integral values of $k$, as is easily seen.

Hence this relation (3*) will hold identically when the formal series for $u_k$, $v_k$ are substituted. This follows from the lemma of § 7.

Differentiating with respect to $k$ and setting $k = 0$, we find

(12) $$0 = \frac{\partial}{\partial u}(Q\,\delta u) + \frac{\partial}{\partial v}(Q\,\delta v).$$




Here we have employed the definitions (7) of $\delta u$, $\delta v$ and we have made use of the fact that the Jacobian determinant reduces to

$$\begin{vmatrix} 1 & 0 \\ 0 & 1 \end{vmatrix}$$

for $k = 0$. Now consider the terms of a particular degree in $Q\,\delta u$ and $-Q\,\delta v$. These homogeneous polynomials $p$ and $q$ have the property

$$\frac{\partial p}{\partial u} = \frac{\partial q}{\partial v},$$

deduced from (12). Hence there exists a homogeneous polynomial $r$ of degree one higher such that

$$p = \frac{\partial r}{\partial v}, \qquad q = \frac{\partial r}{\partial u}.$$

The sum of the polynomials $r$ of all degrees ($\ge 2$) is the formal series $F^*$ required.

From the equations (11) we have immediately $L(F^*) = 0$, so that by § 9 the series $F^*$ is formally invariant under $T$.
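The passage from the pair $p$, $q$ to the polynomial $r$ can be carried out mechanically. The following is a minimal sketch, assuming a hypothetical representation of a polynomial as a dictionary mapping the exponent pair `(i, j)` of $u^i v^j$ to its coefficient; the sample $p$, $q$ are invented for illustration and satisfy $\partial p/\partial u = \partial q/\partial v$.

```python
from fractions import Fraction

def d_du(poly):
    """Partial derivative as to u of a polynomial {(i, j): coefficient}."""
    return {(i - 1, j): c * i for (i, j), c in poly.items() if i > 0}

def d_dv(poly):
    """Partial derivative as to v."""
    return {(i, j - 1): c * j for (i, j), c in poly.items() if j > 0}

def potential(p, q):
    """Return r with dr/dv = p and dr/du = q, assuming dp/du = dq/dv."""
    # Integrate p as to v.
    r = {(i, j + 1): Fraction(c) / (j + 1) for (i, j), c in p.items()}
    # Terms of r free of v are not seen by dr/dv; recover them from q.
    for (i, j), c in q.items():
        if j == 0:
            r[(i + 1, 0)] = Fraction(c) / (i + 1)
    return r

# Assumed sample data: p = u^2 + 6uv, q = 2uv + 3v^2 + 15u^2.
p = {(2, 0): 1, (1, 1): 6}
q = {(1, 1): 2, (0, 2): 3, (2, 0): 15}
r = potential(p, q)  # u^2 v + 3 u v^2 + 5 u^3
```

Applied degree by degree to $Q\,\delta u$ and $-Q\,\delta v$, the same step would build up $F^*$ term by term.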

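If a change of variables from $u$, $v$ to $U$, $V$ be made, the series $F^*$ for the new variables can be obtained by direct substitution. For, from the equations (11) we find

$$\frac{\partial F^*}{\partial U}\frac{\partial U}{\partial v} + \frac{\partial F^*}{\partial V}\frac{\partial V}{\partial v} = Q\left[\frac{\partial u}{\partial U}\,\delta U + \frac{\partial u}{\partial V}\,\delta V\right],$$

$$\frac{\partial F^*}{\partial U}\frac{\partial U}{\partial u} + \frac{\partial F^*}{\partial V}\frac{\partial V}{\partial u} = -Q\left[\frac{\partial v}{\partial U}\,\delta U + \frac{\partial v}{\partial V}\,\delta V\right].$$

Multiplying the first of these equations by $\partial v/\partial V$ and the second by $\partial u/\partial V$ and adding, we find

$$\frac{\partial F^*}{\partial V} = Q\left[\frac{\partial u}{\partial U}\frac{\partial v}{\partial V} - \frac{\partial u}{\partial V}\frac{\partial v}{\partial U}\right]\delta U.$$

But the quasi-invariant function for the new variables is the product of $Q(u, v)$ and the Jacobian of $u$, $v$ as to $U$, $V$ (§ 1). Hence the equation last written shows that $F^*(u, v)$, regarded as a formal series in $U$, $V$, satisfies the first equation (11) for the new variables. Similarly the second equation (11) is seen to hold in these variables.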

Acta mathematica. 43. Imprimé le 18 mars 1920.



From the equations (8) and (11) the explicit forms of the series $F^*$ are immediately evident:

(I'), $F^* = uv \log\rho + \cdots$,

(II'), $F^* = -\tfrac{1}{2}\theta\,(u^2 + v^2) + \cdots$,

(I3 ‘ (ID, F* - 1 *>»«• + —. 

(III'), $F^* = -\tfrac{1}{2}\,d\,u^2 + \cdots$.

It is apparent that any formal power series in $F^*$ furnishes an invariant series.

In order to determine to what extent the existence of formally invariant series for a transformation I', II', II'', III' is characteristic of conservative transformations we need to make a digression.


§ 11. Factorization of formal series.¹

We consider formal series without constant terms. Such a series will be called prime when it cannot be expressed as the product of two others. Since the lowest degree of any term in a product is the sum of the lowest degrees for any terms in the factors, any formal series can be decomposed into prime factors in at least one way, and the number of such factors cannot exceed the degree of the initial terms of that series.

Two factors, either of which can be obtained from the other by multiplication by a formal series with constant term, are regarded as essentially equivalent. Since products and quotients of formal series with constant terms yield series of the same type, the propriety of this convention is obvious.

By a linear change of variables any series $G(u, v)$ can be given the form $c v^n + \cdots$, $c \ne 0$, where the indicated terms are of degree at least $n$. Any possible factor of $G$ is readily seen to have the same prepared form. Also Weierstrass's factorization theorem holds formally, i.e., we may write $G = EH$ where $E$ is a power series with constant term $c$ and $H$ is a power series, $v^n + \cdots$, in which $v$ does not occur with an exponent as large as $n$ after the first term.

Now let us determine the formal series $s(u^{1/\alpha})$ in powers of $u^{1/\alpha}$ which satisfy the equation $H = 0$, and let us proceed at each step of this determination precisely as though $H(u, v)$ were a polynomial in $u$, $v$. The well-known method for doing so yields higher and higher terms of such series, with $\sum \alpha = n$.

¹ Cf. W. F. Osgood, Factorization of analytic functions of several variables, Annals of Mathematics, vol. 19, 1917-1918.

At first sight it might seem conceivable that this process breaks down at some point so that it is not possible to proceed further. But, since the process used involves only a finite set of terms of $H$ at each stage, the same difficulty would necessarily arise if $H$ were broken off at some advanced term. This is absurd since then we are dealing with a polynomial. Thus we obtain a contradiction. Consequently we can obtain formal series of the stated type in $u^{1/\alpha}$ which, when substituted for $v$, reduce $H$ to 0. The initial terms in these power series are at least of the first degree in $u$.

Let $\omega$ be any $\alpha$th root of 1 and consider

$$\prod \left[\, v - s(\omega u^{1/\alpha}) \,\right].$$

This product is precisely $H$, at least if $H$ is a polynomial in $u$ as well as in $v$. By breaking off $H$ at an advanced term and employing a limiting process, we infer that the same is always true.

The bracketed products involve only integral powers of $u$ as well as of $v$, and are prime factors of $G$. Indeed, if such a product $P$ is not prime, its component factors are of prepared form and may be decomposed as $G$ has been. But any new series $S$ so obtained must fail to reduce $P$ to 0 when we write $v = S$. This is absurd.
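That a product taken over all the roots of unity involves only integral powers of $u$ can be tested numerically: the product must not depend on which root of $u$ is chosen. The sketch below assumes a hypothetical branch $s(x) = x + 2x^2$ with $x = u^{1/3}$; neither the branch nor the function names come from the paper.

```python
import cmath

def s(x):
    """Hypothetical branch s(x) = x + 2*x**2, with x standing for u**(1/3)."""
    return x + 2 * x * x

def conjugate_product(u, v, n=3, branch=0):
    """Product over the n-th roots of unity w of (v - s(w*x)),
    where x is the n-th root of u selected by `branch`."""
    angle = (cmath.phase(u) + 2 * cmath.pi * branch) / n
    x = abs(u) ** (1.0 / n) * cmath.exp(1j * angle)
    product = 1.0 + 0.0j
    for j in range(n):
        w = cmath.exp(2j * cmath.pi * j / n)
        product *= v - s(w * x)
    return product

u, v = 0.2 + 0.1j, 0.5 - 0.3j
values = [conjugate_product(u, v, branch=b) for b in range(3)]
```

The three entries of `values` agree up to rounding, since changing the root of $u$ only permutes the factors of the product.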

For a similar reason it appears that, if a prime series divides a product, the series must divide one of the factors.

It follows that, as far as the fundamental theorems of decomposition are concerned, the situation for convergent series carries over directly to divergent series.


§ 12. Condition for conservativeness.

We are now in a position to prove the following:

A necessary and sufficient condition that a transformation $T$ given by real series I', II', II'', III' (but otherwise unrestricted) be conservative is (1) that there exists a real invariant series $F$ of lowest terms one degree higher than those of $\delta u$, $\delta v$ and containing each common prime factor of $\delta u$, $\delta v$ to precisely one power higher than it appears as a common factor in $\delta u$, $\delta v$, and (2) that the formal power series given by the equal ratios

$$\frac{\partial F/\partial v}{\delta u} = -\frac{\partial F/\partial u}{\delta v}$$

converges.

Before entering upon the proof, it may be observed that an inspection of $\delta u$, $\delta v$ as given by (8) shows that, in the cases I', II', $\delta u$ and $\delta v$ have no common factor. In these cases the condition (1) reduces to the condition merely that there exists a formal series $F$ with lowest terms of the second degree. It will appear later that $\delta u$ and $\delta v$ admit of a common factor only in the extraordinarily special cases II'', III' when there exist curves through (0, 0) made up of invariant points.

We first prove the conditions necessary.

We take $F = F^*$. The equations (11) show that this invariant series has lowest terms of degree one higher than the terms in $\delta u$, $\delta v$ of least degree, inasmuch as $Q$ possesses a constant term.

From the equations (11) it follows also that the ratio series of the italicized statement converges to $Q$. It remains to show that $F^*$ contains the common prime factors of $\delta u$, $\delta v$ to a power one higher than these occur as common factors of $\delta u$, $\delta v$.

Let $P^k$ be the highest power of any such prime $P$ occurring in $\delta u$, $\delta v$. By (11) we have

$$\frac{\partial F^*}{\partial u} = P^k a, \qquad \frac{\partial F^*}{\partial v} = P^k b,$$

where either $a$ or $b$ is prime to $P$.

If $F^*$ contains $P$ to higher than the $(k+1)$th power, $\partial F^*/\partial u$ and $\partial F^*/\partial v$ will contain $P$ to higher than the $k$th power. This is in manifest contradiction with the equations last written.

If $F^*$ contains $P$ to a power $m$ with $0 < m < k+1$, and if we write $F^* = P^m G$, we find

$$m G \frac{\partial P}{\partial u} + P \frac{\partial G}{\partial u} = P^{k-m+1} a, \qquad m G \frac{\partial P}{\partial v} + P \frac{\partial G}{\partial v} = P^{k-m+1} b.$$

Hence, since $G$ is prime to $P$, both $\partial P/\partial u$ and $\partial P/\partial v$ are divisible by $P$. At least one of these partial derivatives is possessed of initial terms of lower degree than $P$, so that this possibility is likewise excluded.

The statement under consideration is certainly true then unless, perchance, $F^*$ is not divisible by the prime factor $P$. We have merely to eliminate this possibility.

It was seen in the preceding section that we can write

$$P = E \prod \left[\, v - S(\omega u^{1/n}) \,\right],$$

where $E$ is a formal power series with constant term, where $S$ is an ascending power series in its argument, and where $\omega$ stands for any $n$th root of 1.

Now introduce the variable $t = u^{1/n}$ instead of $u$. We have

$$\frac{\partial F^*}{\partial t} = n t^{n-1} \frac{\partial F^*}{\partial u},$$

while $\partial F^*/\partial v$ is unaltered. Hence the partial derivatives $\partial F^*/\partial t$ and $\partial F^*/\partial v$ are divisible by $v - S(t)$. Let us effect a further change of variables from $v$, $t$ to $w$, $z$ where $w = v - S(t)$, $z = t$. Evidently one has

$$\frac{\partial F^*}{\partial w} = \frac{\partial F^*}{\partial v}, \qquad \frac{\partial F^*}{\partial z} = \frac{\partial F^*}{\partial t} + \frac{\partial F^*}{\partial v}\frac{dS}{dt},$$

so that $\partial F^*/\partial w$ and $\partial F^*/\partial z$ are divisible by $w$.

The fact that $\partial F^*/\partial z$ is divisible by $w$ shows that $F^*$ contains no terms in $z$ alone and is divisible by $w$.

Passing back to the variables $v$, $t$, we infer that $F^*$ expressed as a power series in $t$ is divisible by $v - S(t)$. It follows that $F^*(u, v)$ is divisible by $v - S(u^{1/n})$ and by $P$ of course. This completes the proof that the conditions stated are necessary.

It remains to prove them sufficient.

We may assume that an invariant series $F^*$ exists for which (11) holds in which $Q$ is a convergent power series with constant term 1. These equations follow at once from the second part of the italicized statement under consideration. Our aim is to show that $T$ is conservative.



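By direct differentiation and use of the formal differential equations (9) we obtain

$$\frac{d}{dk}\left[ Q(u_k, v_k) \begin{vmatrix} \dfrac{\partial u_k}{\partial u} & \dfrac{\partial u_k}{\partial v} \\[2mm] \dfrac{\partial v_k}{\partial u} & \dfrac{\partial v_k}{\partial v} \end{vmatrix} \right] = \left[ \frac{\partial Q}{\partial u_k}\,\delta u(u_k, v_k) + \frac{\partial Q}{\partial v_k}\,\delta v(u_k, v_k) \right] \begin{vmatrix} \dfrac{\partial u_k}{\partial u} & \dfrac{\partial u_k}{\partial v} \\[2mm] \dfrac{\partial v_k}{\partial u} & \dfrac{\partial v_k}{\partial v} \end{vmatrix}$$

$$+\; Q(u_k, v_k) \left\{ \begin{vmatrix} \dfrac{\partial\,\delta u(u_k, v_k)}{\partial u} & \dfrac{\partial\,\delta u(u_k, v_k)}{\partial v} \\[2mm] \dfrac{\partial v_k}{\partial u} & \dfrac{\partial v_k}{\partial v} \end{vmatrix} + \begin{vmatrix} \dfrac{\partial u_k}{\partial u} & \dfrac{\partial u_k}{\partial v} \\[2mm] \dfrac{\partial\,\delta v(u_k, v_k)}{\partial u} & \dfrac{\partial\,\delta v(u_k, v_k)}{\partial v} \end{vmatrix} \right\}.$$

But the first determinant in the final brace is the Jacobian of $\delta u(u_k, v_k)$, $v_k$ with respect to $u$, $v$. This determinant may be broken up into the product of the Jacobian of $\delta u(u_k, v_k)$, $v_k$ as to $u_k$, $v_k$ (which is $\partial\,\delta u(u_k, v_k)/\partial u_k$) and the Jacobian of $u_k$, $v_k$ as to $u$, $v$. Likewise the second determinant in the same brace may be expressed as the product of $\partial\,\delta v(u_k, v_k)/\partial v_k$ and the Jacobian of $u_k$, $v_k$ as to $u$, $v$.

Hence we find that the right-hand member of the above equation reduces to

$$\left[ \frac{\partial}{\partial u_k}\bigl( Q(u_k, v_k)\,\delta u(u_k, v_k) \bigr) + \frac{\partial}{\partial v_k}\bigl( Q(u_k, v_k)\,\delta v(u_k, v_k) \bigr) \right] \begin{vmatrix} \dfrac{\partial u_k}{\partial u} & \dfrac{\partial u_k}{\partial v} \\[2mm] \dfrac{\partial v_k}{\partial u} & \dfrac{\partial v_k}{\partial v} \end{vmatrix}.$$

The first factor vanishes identically by (11). Hence the left-hand member of the above equation vanishes identically in $k$. Integrating formally we obtain (3*). For $k = 1$ this becomes (3), which is precisely the condition that $T$ be conservative with a quasi-invariant function $Q$.

It is natural to call a transformation $T$ of types I', II', II'', III' formally conservative if there exists a formal series $F$ satisfying the conditions in part (1) of the italicized statement.

We may inquire precisely what condition the existence of formally invariant series lays upon transformations $T$ of these types. The ratio $Q$ of the italicized statement may or may not be convergent. If it is convergent, then $\iint Q(u, v)\,du\,dv$ is invariant under $T$. If the ratio is not convergent, the double integral is only formally invariant.

These considerations bring out the vitally close connection between conservativeness and formally invariant series.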


§ 13. The formal vanishing of the Jacobian.

To complete our treatment of formally invariant series we need to establish the formal extension of a well-known property of Jacobians:

The Jacobian of two formal series in $u$, $v$ without constant terms vanishes identically if and only if either can be expressed as a power series in the other or in fractional powers of the other.¹

¹ The presence of fractional powers means that the root indicated is to be formally extracted.

It is immediately apparent that, if two functions $A$, $B$ are so expressible one in terms of the other, their Jacobian will vanish identically.

Suppose, conversely, that $A$ and $B$ are power series in $u$, $v$ with vanishing Jacobian:

$$\frac{\partial A}{\partial u}\frac{\partial B}{\partial v} - \frac{\partial A}{\partial v}\frac{\partial B}{\partial u} = 0.$$

Both $A$ and $B$ are exact powers of base series for which it suffices to establish the functional relation. But the Jacobian for the bases also vanishes. Consequently we may confine attention to the case in which neither $A$ nor $B$ is an exact power other than the first.

We begin by showing that $A$ and $B$ have the same prime factors.

If this is not the case, suppose that $A$ is divisible by a prime series $P$, while $B$ is not. After a suitable preliminary change of variables, $P$ is expressible as a product of series $v - S(u^{1/n})$ (§ 11). Now take new variables

$$w = v - S(u^{1/n}), \qquad t = u^{1/n}.$$

The series $A$ and $B$ are power series in these variables without constant terms, and their Jacobian as to $w$, $t$ is 0 by direct reckoning:

$$\frac{\partial A}{\partial w}\frac{\partial B}{\partial t} - \frac{\partial A}{\partial t}\frac{\partial B}{\partial w} = 0.$$

But $A$ is divisible by $w$, and $\partial A/\partial t$ is divisible by $w$ to a power at least as high. Also $\partial A/\partial w$ is divisible by $w$ to a power at least one lower than $A$. Hence $\partial B/\partial t$ is divisible by $w$. From this it follows that $B$ is divisible by $w$.

Proceeding to the original variables we infer that $B$ is divisible by the prime factor $P$, contrary to hypothesis.

Suppose that a prime factor $P$ is contained $p$ times in $A$ and $q$ times in $B$, and choose that factor for which $p/q$ is as small as possible, and thus smaller than for some other factor unless $p/q$ is the same throughout. Except in this case, $A^q/B^p$ will yield a power series without constant term and not containing $P$. But the Jacobian of this series and $A$ is easily verified to be 0 also. This is not possible by the argument used above, since $A^q/B^p$ has not the prime factor $P$ which $A$ admits.

We are thus forced to the conclusion that the power series $A^q/B^p$ starts off with a constant term. But $A$ and $B$ are not exact powers so that we must have $p = q$. Consequently the prime factors of $A$ and $B$ occur with the same multiplicity in $A$ and $B$.

Now consider

$$A = B(c + C), \qquad (c \ne 0),$$

where $C$ is a power series without constant term. It is readily inferred that the Jacobian of $C$, $B$ is 0, and thence that, if $C$ is an exact $q$th power, $C/B^q$ is a power series with constant term. Hence we may write

$$C = B^q (d + D), \qquad (d \ne 0),$$

where $D$ is a power series without constant term. Proceeding in this way indefinitely we find

$$A = cB + dB^{q+1} + \cdots.$$

This establishes the statement.
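The easy half of the statement, that the Jacobian vanishes when one series is a power series in the other, can be verified by direct polynomial arithmetic. The sketch below again assumes the hypothetical dictionary representation `{(i, j): coefficient}` for $u^i v^j$; the sample series $B = uv + u^3$ and $A = 2B + 3B^2$ are invented for illustration.

```python
def add(a, b, scale=1):
    """Return a + scale*b, dropping zero coefficients."""
    out = dict(a)
    for key, c in b.items():
        out[key] = out.get(key, 0) + scale * c
    return {key: c for key, c in out.items() if c != 0}

def mul(a, b):
    """Product of two polynomials."""
    out = {}
    for (i1, j1), c1 in a.items():
        for (i2, j2), c2 in b.items():
            key = (i1 + i2, j1 + j2)
            out[key] = out.get(key, 0) + c1 * c2
    return {key: c for key, c in out.items() if c != 0}

def diff(poly, var):
    """Partial derivative; var = 0 means u, var = 1 means v."""
    out = {}
    for (i, j), c in poly.items():
        e = (i, j)[var]
        if e > 0:
            key = (i - 1, j) if var == 0 else (i, j - 1)
            out[key] = c * e
    return out

def jacobian(A, B):
    """A_u * B_v - A_v * B_u."""
    return add(mul(diff(A, 0), diff(B, 1)), mul(diff(A, 1), diff(B, 0)), scale=-1)

B = {(1, 1): 1, (3, 0): 1}            # B = u v + u^3
A = add(add({}, B, 2), mul(B, B), 3)  # A = 2B + 3B^2
J = jacobian(A, B)                    # expected to be the zero polynomial {}
```

For comparison, `jacobian({(1, 0): 1}, {(0, 1): 1})`, the Jacobian of $u$ and $v$ themselves, is the constant 1.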



§ 14. The totality of invariant series.

We may now prove the following:

If $F^*$ is a $q$th power the most general invariant series is an arbitrary power series in $F^{*1/q}$. The integer $q$ is 1 unless all the prime factors of $F^*$ are common to $\delta u$, $\delta v$.

The results of § 13 assure us that the most general invariant series can be represented as stated if the Jacobian of $F^*$ and any invariant series $F$ vanishes. But we have $L(F^*) = 0$, $L(F) = 0$, whence it appears that the Jacobian does vanish.

If $q \ne 1$ we may write $F^* = G^q$ and (11) gives

$$q\,G^{q-1}\frac{\partial G}{\partial v} = Q\,\delta u, \qquad q\,G^{q-1}\frac{\partial G}{\partial u} = -Q\,\delta v,$$

so that all of the factors of $G$ (and hence of $F^*$) are common to $\delta u$ and $\delta v$.


§ 15. Conditions for formal conservativeness.

At the very outset of the paper the condition (4) was obtained as a consequence of the fact that $T$ was assumed to be conservative. There exist an infinite set of similar conditions on the coefficients of higher degree terms in the power series $u_1$ and $v_1$. These conditions may be found by use of the existence of invariant formal series. We illustrate the method in case I'.

Since $F^*$ begins with a term $uv \log\rho$ in this case, an invariant series $F$, also with first term $uv \log\rho$, can be written down without any other terms having equal exponents in $u$, $v$:

$$F = uv \log\rho + \sum_{m+n=3}^{\infty} F_{mn} u^m v^n, \qquad (m \ne n).$$

This series $F$ may be obtained by writing $F = F^* + cF^{*2} + \cdots$, and choosing the arbitrary coefficients so as to eliminate terms with equal exponents.

Moreover, it is easy to see that there is only one such series, since any invariant series can be expressed as a power series in $F^*$ (§ 14).

Now, when coefficients of $u^m v^n$ are compared, the formal relation $F(u_1, v_1) = F(u, v)$ gives a series of equations

$$(\rho^{m-n} - 1)\,F_{mn} = P_{mn}, \qquad (m + n \ge 3).$$

Here $P_{mn}$ is a linear expression in the quantities $F_{\alpha\beta}$ with $\alpha + \beta < m + n$. Thus we determine $F_{mn}$, for $m + n = 3$, $m + n = 4$, ..., as polynomials in the coefficients $\varphi_{mn}$, $\psi_{mn}$ of the series for $u_1$, $v_1$. For $m = n$ we have $P_{mn} = 0$.

In the case I' the polynomials $P_{nn}$ in $\varphi_{\alpha\beta}$, $\psi_{\alpha\beta}$ ($\alpha + \beta < 2n$) vanish for $n = 2, 3, \ldots$.

Conversely, if these vanish we have a formally invariant series $F$, and formal conservativeness of $T$ in consequence.

Similar conditions for formal conservativeness can be found in the other cases.


§ 16. Invariant formal curves.

Let $f$ and $g$ be two formal power series in a parameter $t$, without constant terms and not both identically 0. Then we shall regard the equations

$$u = f(t), \qquad v = g(t)$$

as furnishing a formal curve through the point (0, 0). If the series $f$, $g$ converge for $|t|$ small we have an analytic curve.

Two curves of this sort will be regarded as identical if one can be obtained from the other by change of parameter $t = l(\tau)$ where $l$ is a formal power series in $\tau$ or a fractional power thereof.

A formal curve is regarded as real if the coefficients in $f$ and $g$ can be taken real.

By means of $T$ a formal curve of this sort is regarded as carried over into the formal curve

$$u = u_1(f(t), g(t)), \qquad v = v_1(f(t), g(t)).$$

If this transformed curve is identical with the given curve $u = f(t)$, $v = g(t)$ then the given curve is said to be formally invariant under $T$.

The determination of the formally invariant curves is essential for our purpose. A fundamental division of types of invariant points will be made according as there do or do not exist curves of this sort given by real series. In cases I', II', II'', III' the transformation $T$ will be called hyperbolic if real formally invariant curves exist, and elliptic in the contrary case. In cases II''' or III'', $T$ is hyperbolic or elliptic according as $T_q$ or $T_2$ (of type II'') is one or the other.



If $t_1$ denotes the power series in $t$ or a fractional power thereof along the transformed invariant curve which relates its parameter and $t$, we have

$$f(t_1) = u_1(f(t), g(t)), \qquad g(t_1) = v_1(f(t), g(t)).$$

In virtue of the fact that the determinant of the coefficients of the first degree terms in $u_1$, $v_1$ is not 0 (see (4)) we can show that the power series $t_1$ starts off with a first degree term in $t$. For suppose it commences with a term of higher degree. The initial term of one of the two right-hand members above will be of degree $\alpha$, where $\alpha$ is the lowest degree of any term in $f$ or $g$. But the left-hand members will start off with higher degree terms, which is impossible. Similarly we may rule out the possibility that the initial term in $t_1$ is of lower degree than the first, by making use of the inverse equations

$$f(t) = u_{-1}(f(t_1), g(t_1)), \qquad g(t) = v_{-1}(f(t_1), g(t_1)).$$

Hence $t_1$ is a power series in $t$ or a fractional power thereof beginning with a term of the first degree.

If $\alpha$ is the degree of the lowest term in $f$ or $g$ (say in $f$), then from the corresponding equation (the first) we obtain on the left a series in $t_1$, $p\,t_1^{\alpha} + \cdots$, and on the right a similar series in $t$ commencing with a term of degree not less than $\alpha$ and therefore of degree precisely $\alpha$ by the above. Extracting $\alpha$th roots we conclude finally that $t_1$ can be expressed as an ordinary power series in $t$ with first degree term:

$$t_1 = \varepsilon t + \cdots.$$

Having this explicit form of $t_1$ in mind, let us compare anew the two members of each of the pair of equations first written. We write

$$f(t) = p\,t^{\alpha} + \cdots, \qquad g(t) = q\,t^{\alpha} + \cdots,$$

so that $|p| + |q| \ne 0$, and obtain

$$p\,\varepsilon^{\alpha} = ap + bq, \qquad q\,\varepsilon^{\alpha} = cp + dq.$$

It follows at once that $\varepsilon^{\alpha}$ is a root of the characteristic equation, i.e. that $\varepsilon^{\alpha} = \rho$.

If (0, 0) is an 'ordinary point' of the formal curve we have $\alpha = 1$, $\rho = \varepsilon$.

By successive transformation of the invariant curve by $T$, we obtain not only $t_1$, but parameters $t_2$, $t_3$, .... Likewise by the inverse transformation we obtain parameters $t_{-1}$, $t_{-2}$, .... These can all be obtained from the series for $t_1$ by iteration.




§ 17. The formal series for $t_k$ and the formal group.

Since the constant $\varepsilon$ is an $\alpha$th root of $\rho$ it is clear that, if we write $\tau_1(\tau) = t_{\alpha}(\tau)$, then we have $\tau_1 = \rho\tau + \cdots$. By iteration $\tau_k$ may be defined for all integral values of $k$. Moreover, the methods used in § 4 serve at once to show that

$$\tau_k = \rho^k \tau + \sum_{m=2}^{\infty} \tau_m^{(k)} \tau^m,$$

where $\tau_m^{(k)}$ is a polynomial in $\rho^k$ of degree at most $m$ if $\rho \ne 1$, and a polynomial in $k$ of degree at most $m - 1$ if $\rho = 1$.

For all integral values of $k$ and $l$ we have obviously

$$\tau_k(\tau_l) = \tau_{k+l}.$$

Therefore, by the lemma of § 7, this holds formally for all real values of $k$ and $l$. We write

$$\delta\tau = \left.\frac{\partial \tau_k}{\partial k}\right|_{k=0},$$

and can then show (compare with § 8) that the formal differential equation

$$\frac{d\tau_k}{dk} = \delta\tau(\tau_k)$$

is satisfied, and, together with the initial condition $\tau_0 = \tau$, wholly determines the series for $\tau_k$.
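A concrete one-variable instance with $\rho = 1$ may help fix ideas. The sketch below assumes the hypothetical family $\tau_k = \tau/(1 - k\tau) = \sum_m k^{m-1}\tau^m$, whose coefficient of $\tau^m$ is a polynomial in $k$ of degree $m - 1$ and for which $\delta\tau = \tau^2$; the truncation order `N` and the helper names are illustrative choices.

```python
from fractions import Fraction

N = 6  # truncation order: powers tau^0 .. tau^N are kept

def tau_series(k):
    """Coefficients c[m] of tau^m in tau/(1 - k*tau) = sum of k^(m-1) tau^m."""
    return [Fraction(0)] + [Fraction(k) ** (m - 1) for m in range(1, N + 1)]

def mul_trunc(a, b):
    """Product of two truncated series."""
    out = [Fraction(0)] * (N + 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j <= N:
                out[i + j] += ai * bj
    return out

def compose(outer, inner):
    """outer(inner(tau)) truncated at order N; inner has no constant term."""
    result = [Fraction(0)] * (N + 1)
    power = [Fraction(1)] + [Fraction(0)] * N  # inner**0
    for m in range(N + 1):
        for i in range(N + 1):
            result[i] += outer[m] * power[i]
        power = mul_trunc(power, inner)
    return result

k, l = Fraction(1, 2), Fraction(-3, 4)
lhs = compose(tau_series(k), tau_series(l))  # tau_k(tau_l)
rhs = tau_series(k + l)                      # tau_{k+l}
```

The two coefficient lists coincide exactly for these non-integral values of $k$ and $l$, which is the formal group property extended to real parameter values.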


§ 18. The invariant operator $L(u, v)$.

We shall define a second invariant differential operator:

$$L(u, v) = \delta u\,dv - \delta v\,du.$$

It can be immediately verified that, if the variables $u$, $v$ are changed to $\bar u$, $\bar v$, then $L(u, v)$ becomes $L(\bar u, \bar v)$ multiplied by the Jacobian of $u$, $v$ as to $\bar u$, $\bar v$. It is also obvious that, if $u = f(t)$, $v = g(t)$ is a formal curve, then $L(u, v)$ is independent of the particular parameter chosen for the curve.

The necessary and sufficient condition for the invariance of a formal curve $u = f(t)$, $v = g(t)$ under $T$ is $L(u, v) = 0$.
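The condition $L(u, v) = 0$ can be tried on the leading terms of case I'. The sketch below assumes the purely linear model $\delta u = u\log\rho$, $\delta v = -v\log\rho$ with the hypothetical value $\rho = 2$, and compares the axis $v = 0$, which is invariant under $(u, v) \to (\rho u, v/\rho)$, with the diagonal $u = v$, which is not.

```python
import math

RHO = 2.0  # assumed multiplier for the illustration

def delta(u, v):
    """Leading terms of (delta u, delta v) in case I'."""
    return u * math.log(RHO), -v * math.log(RHO)

def L_on_curve(f, g, df, dg, t):
    """delta_u * dv/dt - delta_v * du/dt along the curve u = f(t), v = g(t)."""
    delta_u, delta_v = delta(f(t), g(t))
    return delta_u * dg(t) - delta_v * df(t)

t = 0.3
# Axis u = t, v = 0.
L_axis = L_on_curve(lambda s: s, lambda s: 0.0, lambda s: 1.0, lambda s: 0.0, t)
# Diagonal u = t, v = t.
L_diag = L_on_curve(lambda s: s, lambda s: s, lambda s: 1.0, lambda s: 1.0, t)
```

Here `L_axis` vanishes while `L_diag` equals $2t\log\rho$, consistent with the fact that the diagonal is carried into the line $v = u/\rho^2$.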




By definition of invariance we have for such an invariant curve

$$f(t_1) = u_1(f(t), g(t)), \qquad g(t_1) = v_1(f(t), g(t)),$$

and thence for integral values of $k$

$$f(t_k) = u_k(f(t), g(t)), \qquad g(t_k) = v_k(f(t), g(t)).$$

If we take $k$ as an integral multiple $k'\alpha$ of $\alpha$ (§ 17) and write $t = \tau$, $t_k = \tau_{k'}$ (§ 17), we have in particular

$$f(\tau_{k'}) = u_{k'\alpha}(f(\tau), g(\tau)), \qquad g(\tau_{k'}) = v_{k'\alpha}(f(\tau), g(\tau)),$$

for integral values of $k'$.

Let the general series for $u_{k'\alpha}$, $v_{k'\alpha}$, $\tau_{k'}$ be substituted in the last equations. All the coefficients are either polynomials in $\rho^{k'}$, $\rho^{-k'}$, $k'$ (case I'), or in $\cos k'\theta$, $\sin k'\theta$, $k'$ (case II'), or in $k'$ (cases II'', III'). Hence, by the lemma of § 7, these equations are identically true from a formal standpoint.

Differentiating formally as to $k'$ and setting $k' = 0$, we get

$$\frac{df}{d\tau}\,\delta\tau = \alpha\,\delta u(f, g), \qquad \frac{dg}{d\tau}\,\delta\tau = \alpha\,\delta v(f, g),$$

whence at once $L(u, v) = 0$.

Conversely, let us assume that $L(u, v)$ is 0 for a formal curve $u = f(t)$, $v = g(t)$, and let us show that the curve is invariant under $T$.

In this case we have

$$\frac{df}{dt}\,\chi(t) = \delta u(f, g), \qquad \frac{dg}{dt}\,\chi(t) = \delta v(f, g),$$

where $\chi$ is the sum of a polynomial in $1/t$ and a power series in $t$. Now, since $\delta u$, $\delta v$ begin with terms of the first degree or of higher degree, both right-hand members have initial terms of degree at least as high as $f$ or $g$. On the other hand $df/dt$ and $dg/dt$ are of degree one less than $f$ and $g$ respectively. Hence $\chi(t)$ cannot contain negative powers of $t$ or even a constant term. Thus $\chi(t)$ is an ordinary power series in $t$ without constant term.




Define $t_k$ by the differential equation

$$\frac{dt_k}{dk} = \chi(t_k)$$

and the initial condition $t_0 = t$. Thus $t_k$ is formally determined as a power series in $t$ with coefficients analytic in $k$.

For example in case I', $\delta u$ and $\delta v$ are given by (see (8))

$$\delta u = u \log\rho + \cdots, \qquad \delta v = -v \log\rho + \cdots.$$

Hence an inspection of the above equations introducing $\chi(t)$ shows that this function possesses a first degree term in $t$, $\dfrac{\log\rho}{\alpha}\,t$, $\alpha$ an integer.

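Write then

$$\chi(t) = \frac{\log\rho}{\alpha}\,t + \sum_{m=2}^{\infty} \chi_m t^m, \qquad t_k = \sum_{m=1}^{\infty} \varphi_m^{(k)} t^m,$$

and the differential equation gives

$$\frac{d\varphi_1^{(k)}}{dk} = \frac{\log\rho}{\alpha}\,\varphi_1^{(k)}, \qquad \frac{d\varphi_2^{(k)}}{dk} = \frac{\log\rho}{\alpha}\,\varphi_2^{(k)} + \chi_2 \bigl(\varphi_1^{(k)}\bigr)^2, \quad \ldots$$

on comparison of terms in $t$. Remembering the initial conditions $\varphi_1^{(0)} = 1$, $\varphi_2^{(0)} = 0$, ..., we find

$$\varphi_1^{(k)} = \rho^{k/\alpha}, \qquad \varphi_2^{(k)} = \frac{\alpha\chi_2}{\log\rho}\left(\rho^{2k/\alpha} - \rho^{k/\alpha}\right), \quad \ldots$$

Thus the successive coefficients are polynomials of increasing degree in $\rho^{k/\alpha}$.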

Likewise in case II' these coefficients are polynomials of increasing degrees 

in cos M sin M ; and in cases II". Ill', polynomials in k only, since here *(<) 
a a 

starts out with a terra of the second degree or higher. 

Consider now the formal series $f(t_k)$ and $g(t_k)$. Differentiating and using
the definition of $t_k$, we find

$$\frac{d f(t_k)}{dk} = \frac{df}{dt_k}\,\chi(t_k) = \delta u(f(t_k), g(t_k)), \qquad \frac{d g(t_k)}{dk} = \frac{dg}{dt_k}\,\chi(t_k) = \delta v(f(t_k), g(t_k)).$$

These series $f(t_k)$ and $g(t_k)$ reduce to $f(t)$ and $g(t)$ for $k = 0$.

Consider next the formal series $u_k(f(t), g(t))$, $v_k(f(t), g(t))$. Differentiating
and using (9) we find

$$\frac{\partial}{\partial k}\bigl[u_k(f(t), g(t))\bigr] = \delta u\bigl[u_k(f(t), g(t)),\, v_k(f(t), g(t))\bigr],$$

$$\frac{\partial}{\partial k}\bigl[v_k(f(t), g(t))\bigr] = \delta v\bigl[u_k(f(t), g(t)),\, v_k(f(t), g(t))\bigr].$$

Also these series reduce to $f(t)$, $g(t)$ for $k = 0$.

Hence, if either pair of series in $t$ be denoted by $p_k(t)$, $q_k(t)$, the differential
equations

$$\frac{dp_k}{dk} = \delta u(p_k, q_k), \qquad \frac{dq_k}{dk} = \delta v(p_k, q_k),$$

and the initial conditions $p_0 = f(t)$, $q_0 = g(t)$ will be satisfied.

But, just as in an analogous situation earlier, these equations and conditions
uniquely determine the series. Hence the two solutions coincide:

$$f(t_k) = u_k(f(t), g(t)), \qquad g(t_k) = v_k(f(t), g(t)).$$

Taking $k = 1$, we conclude that the given formal curve is invariant under $T$.

§ 19. Existence of invariant formal curves.

When a formal power series in $u$, $v$ without constant term is resolved into
its prime factors in the sense of § 21, each such factor evidently corresponds to
a formal curve $u = t^{\alpha}$, $v = S(t)$ where $S$ is a power series in $t$. When the
coordinates of this curve are substituted in the given formal series in $u$, $v$, it
vanishes identically. Conversely, if the coordinates of a formal curve render
such a series equal to $0$, then it renders one and only one of the prime factors
equal to $0$, and this formal curve must be the one corresponding to the factor.
With these facts in mind we may prove:

The totality of formally invariant curves for a conservative transformation $T$ is
given by the equation $F = 0$, where $F$ is any invariant series under $T$.




First, let us take any curve for which $F = 0$. Now we have $F(u_k, v_k) =
F(u, v)$, and thence, by formal differentiation as to $k$ and taking $k = 0$,

$$\frac{\partial F}{\partial u}\,\delta u + \frac{\partial F}{\partial v}\,\delta v = 0.$$

But we have also

$$\frac{\partial F}{\partial u}\,du + \frac{\partial F}{\partial v}\,dv = 0$$

along the curve. Combining these equations we find $L(u, v) = 0$. By the preceding paragraph
the formal curve is invariant.

Conversely, for any invariant formal curve $u = f(t)$, $v = g(t)$ we have

$$f(t_k) = u_k(f(t), g(t)), \qquad g(t_k) = v_k(f(t), g(t)),$$

as we have seen. Hence it follows that

$$F(f(t_k), g(t_k)) = F\bigl(u_k(f(t), g(t)),\, v_k(f(t), g(t))\bigr) = F(f(t), g(t)).$$

Now, taking $k = k'\alpha$, $t_k = \tau_{k'}$, we may regard this equation as holding for all $k'$
(§ 17). Differentiating as to $k'$ and taking $k' = 0$, we find

$$\left[\frac{\partial F}{\partial u}\frac{df}{d\tau} + \frac{\partial F}{\partial v}\frac{dg}{d\tau}\right]\delta\tau = 0.$$

Unless $\delta\tau = 0$ we infer that $\frac{dF}{d\tau} = 0$. But $F(f(t), g(t))$ is a power series in $t$
without constant term. Hence except in this case we have $F(f(t), g(t)) = 0$, as we
desire to prove.

However, if we take the equations which state that $u = f$, $v = g$, and its
iterates under $T$ coincide (as written above), and differentiate as to $k'$ ($k = k'\alpha$),
we find for $k' = 0$

$$\frac{df}{d\tau}\,\delta\tau = \alpha\,\delta u(f, g), \qquad \frac{dg}{d\tau}\,\delta\tau = \alpha\,\delta v(f, g).$$

Hence $\delta\tau$ vanishes formally if and only if $\delta u(f, g)$, $\delta v(f, g)$ vanish. In other
words the given curve corresponds to a common factor of $\delta u$, $\delta v$. But it has
been proved (§ 12) that such factors occur to a power one higher in $F$. Hence
we have $F(f(t), g(t)) = 0$ in this case also.




Applying the above condition to $F^*$ (see (13)), we perceive that in case I'
we have two real formally invariant curves so that $T$ is hyperbolic, while in
case II' we have a pair of conjugate imaginary formally invariant curves so that
$T$ is elliptic.


§ 20. Invariant point curves. 

In an extremely special case the invariant point $(0, 0)$ may not be isolated
but may lie on one or more analytic curves of invariant points passing through
$(0, 0)$. These curves can be determined as the solutions of the ordinary equations

$$u_1(u, v) = u, \qquad v_1(u, v) = v.$$

By iteration we get

$$u_k(u, v) = u, \qquad v_k(u, v) = v,$$

which hold along these curves. Differentiating as to $k$, as we have often done,
and setting $k = 0$, we find

$$\delta u = 0, \qquad \delta v = 0,$$

along the invariant point curve. In other words the invariant point curves
correspond to common factors of $\delta u$, $\delta v$. According to § 12 this means that
the curve corresponds to a multiple factor of $F^*$.

Conversely, let us assume that $F^*$ has a multiple factor corresponding to a
formal curve $u = f(t)$, $v = g(t)$, so that $\delta u = \delta v = 0$ along the curve. By formal
integration we get $u_k(f, g) = f$, $v_k(f, g) = g$, and the formal curve is an invariant
point curve.

There exist formally invariant point curves if and only if $F^*$ has a multiple
factor, and these curves are then analytic curves given by the equations $u_1 = u$,
$v_1 = v$.


§ 21. Normal form. Case I'.

Under a formal change of variables from $u$, $v$ to $U$, $V$ such as

$$U = u + \sum_{m+n=2}^{\infty} U_{mn}\,u^m v^n, \qquad V = v + \sum_{m+n=2}^{\infty} V_{mn}\,u^m v^n, \tag{15}$$

transformations of the type I', II', II'', III' evidently maintain their type, and
also remain formally conservative if they are so at the outset.


We propose to develop a normal form for the transformation $T$ in the
cases I', II'. In the other cases there appear to be an infinite number of invariants,
and a similar normal form does not exist.

By a formal change of variables (15), a formally conservative transformation of
type I' may be given either the normal form

$$U_1 = \rho\,U e^{cU^lV^l}, \qquad V_1 = \frac{1}{\rho}\,V e^{-cU^lV^l}, \qquad (c \neq 0), \tag{16}$$

or the form

$$U_1 = \rho\,U, \qquad V_1 = \frac{1}{\rho}\,V. \tag{16'}$$


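We propose first to choose $U$, $V$ so that

$$\delta U = U[1 + f(UV)]\log\rho, \qquad \delta V = -V[1 + g(UV)]\log\rho,$$

where $f$ and $g$ are power series in their argument $UV$. More explicitly written,
these equations take the form

$$\frac{\partial U}{\partial u}\,\delta u + \frac{\partial U}{\partial v}\,\delta v = U[1 + f(UV)]\log\rho, \qquad \frac{\partial V}{\partial u}\,\delta u + \frac{\partial V}{\partial v}\,\delta v = -V[1 + g(UV)]\log\rho; \tag{17}$$

recall the equations

$$\delta U = \left.\frac{\partial U_k}{\partial k}\right|_{k=0}, \qquad U_k = U(u_k, v_k),$$

and similar equations in $V$.

By the first equation (8) the first degree terms on both sides of the above
equations are the same.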

Equating coefficients of $u^m v^n$ in these equations, we find

$$\begin{aligned}
(m - n - 1)\,U_{mn} &= P_{mn}, &&(m \neq n + 1),\\
(m - n + 1)\,V_{mn} &= Q_{mn}, &&(n \neq m + 1),\\
0 &= P_{m+1,m} + f_{2m+1},\\
0 &= Q_{m,m+1} + g_{2m+1},
\end{aligned}$$

where $P_{mn}$, $Q_{mn}$ are polynomials in $U_{\alpha\beta}$, $V_{\alpha\beta}$, $f_{\gamma}$, $g_{\gamma}$ with $\alpha + \beta < m + n$, $\gamma < m + n$,
and where $f_{2k+1}$, $g_{2k+1}$ are the coefficients of $(UV)^k$ in $f$ and $g$ respectively. These
equations are manifest as soon as the explicit series for $U$, $V$ are substituted
in (17).

Let us compare second degree terms, so that $m + n = 2$. The first two
equations determine $U_{mn}$, $V_{mn}$ for $m + n = 2$ uniquely.

Next let us compare third degree terms so that $m + n = 3$. Here the
quantities $U_{mn}$, $V_{mn}$, excepting $U_{21}$, $V_{12}$, are determined by the first two equations
while $f_3$, $g_3$ are determined by the second two equations.

Continuing in this way we determine in succession $U_{mn}$, $V_{mn}$, $f_p$, $g_p$, save
for $U_{21}$, $V_{12}$, $U_{32}$, $\ldots$, which can be taken arbitrarily.

Therefore it is possible to determine formal series $U$, $V$ so that (17) holds.
In order to avoid complexity in our notation let us call these new variables
$u$, $v$. It may be observed that the set of changes of variables (15) form a
group. Accordingly, in accomplishing the desired normalization, we can compound
any number of such changes of variables. With this understanding we may
write

$$\delta u = u[1 + f(uv)]\log\rho, \qquad \delta v = -v[1 + g(uv)]\log\rho.$$

If $Q$ denotes the formal quasi-invariant function, we have by (12)

$$\frac{\partial}{\partial u}\bigl[Qu(1 + f)\bigr] = \frac{\partial}{\partial v}\bigl[Qv(1 + g)\bigr]$$

on substituting in the above values of $\delta u$, $\delta v$. Here $f$ and $g$ are series in the
product $uv$.

It follows from the equation just written that $Q$ must also be a series in
the product $uv$. Suppose if possible that this is not the case, and let $d\,u^m v^n$
be a term in $Q$ of minimum degree for which $m \neq n$. A term $(m + 1)\,d\,u^m v^n$ will
then appear on the left of the equation written. But no other term of equal
or lower degree in which the exponents of $u$, $v$ are unequal can occur on the
left inasmuch as terms with unequal exponents are not present in $f$. A similar
term $(n + 1)\,d\,u^m v^n$ will occur on the right. If then the above identity holds we
must have $d = 0$, contrary to hypothesis.

Thus if we write $z = uv$, and use accents to denote differentiation with
respect to $z$ we have easily

$$[Qz(1 + f)]' = [Qz(1 + g)]'.$$




By formal integration we get then $f = g = h$, where $h$ is a formal power series
in $z$ without constant term. Consequently we have

$$\delta u = u[1 + h(uv)]\log\rho, \qquad \delta v = -v[1 + h(uv)]\log\rho.$$

When use is made of this fact, the formal differential equations (9) for $u_k$,
$v_k$ take the form

$$\frac{du_k}{dk} = u_k[1 + h(u_k v_k)]\log\rho, \qquad \frac{dv_k}{dk} = -v_k[1 + h(u_k v_k)]\log\rho.$$

Hence, if we consider the product series $u_k v_k$, we have $\frac{d(u_k v_k)}{dk} = 0$. Noting that
we have $u_0 = u$, $v_0 = v$, we conclude

$$u_k v_k = uv.$$

If we substitute this value for $u_k v_k$ in the differential equations, these
become

$$\frac{du_k}{dk} - u_k[1 + h(uv)]\log\rho = 0, \qquad \frac{dv_k}{dk} + v_k[1 + h(uv)]\log\rho = 0.$$

If we multiply these two equations by

$$\rho^{-[1 + h(uv)]k}, \qquad \rho^{[1 + h(uv)]k}$$

respectively, the left-hand members become exact formal derivatives. Integrating
formally we find

$$\rho^{-[1 + h(uv)]k}\,u_k = \text{const.}, \qquad \rho^{[1 + h(uv)]k}\,v_k = \text{const.},$$

where the constants are power series in $u$, $v$ with coefficients independent of $k$.
Employing the initial conditions $u_0 = u$, $v_0 = v$, we get the following explicit
formulas

$$u_k = \rho^{k}\,u\,e^{h(uv)k}, \qquad v_k = \rho^{-k}\,v\,e^{-h(uv)k},$$

for the given transformation after the change of variables determined earlier,
where $h\log\rho$ has been written as $h$ for brevity.

If $h$ vanishes identically, a reduction to the normal form (16') has been
effected.

In the contrary case we may write

$$h(uv) = c\,u^l v^l + \cdots.$$




Dividing by $c\,u^l v^l$ and extracting the $l$th root, we find

$$\left[\frac{h(uv)}{c\,u^l v^l}\right]^{1/l} = p(uv),$$

where $p(uv)$ is a power series in $uv$ with constant term $1$. If we define a further
change of variables (15)

$$U = u\sqrt{p(uv)}, \qquad V = v\sqrt{p(uv)},$$

we obtain immediately the first normal form (16).

The normal forms are clearly of integrable type with UV invariant. 

It is apparent that, if T is given by real series, the normalizing series U, V 
can also be taken real. 
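
The integrable character of the normal form (16) can be checked numerically. The following is only a minimal sketch: the function names and the sample values of $\rho$, $c$, $l$ and of the starting point are assumptions chosen for illustration, not taken from the text. It iterates $U_1 = \rho U e^{cU^lV^l}$, $V_1 = \rho^{-1} V e^{-cU^lV^l}$ and confirms that the product $UV$ is unchanged, so that the $k$th iterate is $U_k = \rho^k U e^{kc(UV)^l}$.

```python
import math

def normal_form_step(U, V, rho=2.0, c=0.5, l=1):
    """Apply the normal form (16) once and return (U_1, V_1).

    rho, c, l are hypothetical sample values, not values from the paper.
    """
    w = c * (U * V) ** l  # the exponent depends only on the product UV
    return rho * U * math.exp(w), V * math.exp(-w) / rho

def iterate(U, V, k, **params):
    """Apply the normal form k times (k a non-negative integer)."""
    for _ in range(k):
        U, V = normal_form_step(U, V, **params)
    return U, V

if __name__ == "__main__":
    U0, V0, k = 0.3, 0.2, 5
    Uk, Vk = iterate(U0, V0, k)
    print("UV before:", U0 * V0, "after:", Uk * Vk)
    # Closed form of the k-th iterate, valid because UV is invariant.
    print("U_k:", Uk, "closed form:", 2.0**k * U0 * math.exp(k * 0.5 * U0 * V0))
```

Because the exponent is a function of the invariant $UV$ alone, iterating only multiplies the exponent by $k$; this is what makes the closed form available.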


§ 22. Generality of normal form. Case I'. 

The normalizing series $U$, $V$ were not uniquely determined. The most
general set $U^*$, $V^*$ of such series is related to any particular set $U$, $V$ as follows:

The most general normalizing variables $U^*$, $V^*$ in case I' have the explicit form

$$U^* = U e^{\lambda(UV)}, \qquad V^* = V e^{-\lambda(UV)}, \tag{18}$$

where $\lambda$ is an arbitrary power series in $UV$ without constant term, and $U$, $V$ are any
particular set of normalizing series.

Clearly we can pass directly from $U$, $V$ to $U^*$, $V^*$ by a change of variables
(15). Since the invariant curves $U = 0$, $V = 0$ are carried into $U^* = 0$, $V^* = 0$,
we infer further

$$U^* = U(1 + \cdots), \qquad V^* = V(1 + \cdots).$$

Now the products $UV$, $U^*V^*$ are invariant under $T$. Hence (§ 14) $U^*V^*$
is given by a power series in $UV$, whose initial term is $UV$ of course. By the
aid of this result we may conclude that in the series for $U^*$, $V^*$ only terms in
$UV$ occur in the parentheses.

In fact, if we replace $U$, $V$, $U^*$, $V^*$ by $U_1$, $V_1$, $U_1^*$, $V_1^*$ respectively, the
first of these equations gives

$$\rho\,U^* e^{c^* U^{*l^*} V^{*l^*}} = \rho\,U e^{cU^lV^l}(1 + \cdots).$$

On the left the exponential factor is a power series in $UV$ with constant term $1$
inasmuch as $U^*V^*$ is given by a power series in $UV$ without constant term.
Suppose if possible that a term $d\,U^{m+1}V^n$ ($m \neq n$) occurs in the series for $U^*$ and
let this term be of the minimum degree. On the left of the equation last written
the corresponding term of this type is $\rho\,d\,U^{m+1}V^n$, whereas on the right it is
$\rho^{m+1-n}\,d\,U^{m+1}V^n$. The two terms to be compared cannot be equal so that a
contradiction results. In this way the parenthesis in the series for $U^*$, and
likewise that in the series for $V^*$, are seen to contain only terms in $UV$.

We may now write

$$U^* = U(1 + \lambda'(UV)), \qquad V^* = V(1 + \lambda''(UV)),$$

where $\lambda'$, $\lambda''$ are power series without constant terms.

Replacing $U$, $V$, $U^*$, $V^*$ by $U_1$, $V_1$, $U_1^*$, $V_1^*$ respectively here, we get

$$\rho\,U^* e^{c^* U^{*l^*} V^{*l^*}} = \rho\,U e^{cU^lV^l}\,[1 + \lambda'(UV)],$$

and also a companion equation. Bearing in mind the form of $U^*$, we conclude
at once $c^* = c$, $l^* = l$ and then $U^*V^* = UV$. This yields the relation stated
between $U^*$, $V^*$ and $U$, $V$, as well as the additional result:

The integer $l$ and constant $c$ are independent of the normalizing series employed.
Thus $l$ and $c$ are the only invariants. In the case of the normal form (16')
we write $l = \infty$, $c = 0$ for convenience.

Conversely, it is at once shown that any change of variables from $U$, $V$ to
$U^*$, $V^*$ yields normalizing variables.
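
The converse statement can be illustrated numerically. The sketch below assumes the reading $U^* = Ue^{\lambda(UV)}$, $V^* = Ve^{-\lambda(UV)}$ of (18); the particular series `lam`, the constants `RHO`, `C`, `L` and the sample point are hypothetical choices. It checks that the change of variables commutes with the normal form (16), which is what it means for $U^*$, $V^*$ to be normalizing variables with the same $l$ and $c$.

```python
import math

RHO, C, L = 2.0, 0.5, 1  # hypothetical sample values for rho, c, l

def step(U, V):
    """One application of the normal form (16)."""
    w = C * (U * V) ** L
    return RHO * U * math.exp(w), V * math.exp(-w) / RHO

def lam(z):
    """An arbitrary sample series in z = UV without constant term (truncated)."""
    return 0.4 * z + 0.1 * z * z

def star(U, V):
    """Change of variables (18), as read here: (U, V) -> (U*, V*)."""
    s = lam(U * V)
    return U * math.exp(s), V * math.exp(-s)

if __name__ == "__main__":
    U, V = 0.3, 0.2
    print(step(*star(U, V)))  # normal form applied in the starred variables
    print(star(*step(U, V)))  # starred variables of the transformed point
```

Both orders agree because $U^*V^* = UV$ and $UV$ is invariant under (16), so each exponential factor sees the same argument.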

§ 23. Normal form. Case II'. 

It has appeared earlier that cases I' and II' are of the same formal character
in the complex domain. This is evident if variables

$$\bar u = u + \sqrt{-1}\,v, \qquad \bar v = u - \sqrt{-1}\,v$$

are introduced in case II', when we have

$$\bar u_1 = \rho\,\bar u + \cdots, \qquad \bar v_1 = \frac{1}{\rho}\,\bar v + \cdots, \qquad \rho = e^{\sqrt{-1}\,\theta}.$$

Moreover in case II' we have $\rho^k \neq 1$ for any integer $k \neq 0$. Consequently the
same formal manipulation of the variables $\bar u$, $\bar v$ is possible as for $u$, $v$. Moreover
changes of variables (15) of $u$, $v$ yield changes of variables (15) of $\bar u$, $\bar v$. Keeping
these facts in mind, we deduce without difficulty the following important result:

By a formal change of variables (15), a formally conservative transformation of
type II' may be given either the normal form

$$\begin{aligned}
U_1 &= U\cos\bigl(\theta + c(U^2 + V^2)^l\bigr) - V\sin\bigl(\theta + c(U^2 + V^2)^l\bigr),\\
V_1 &= U\sin\bigl(\theta + c(U^2 + V^2)^l\bigr) + V\cos\bigl(\theta + c(U^2 + V^2)^l\bigr),
\end{aligned} \tag{19}$$

or the form

$$U_1 = U\cos\theta - V\sin\theta, \qquad V_1 = U\sin\theta + V\cos\theta. \tag{19'}$$

Also, on account of the possibility of preserving the conjugate relation of
the series $\bar u$, $\bar v$ employed at every step of the formal work (so that $u$, $v$ are real
series), we conclude that, if $T$ is given by real series, the normalizing series $U$, $V$
can also be taken real.
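
As with case I', the normal form (19) is easily explored numerically. The following minimal sketch (function name and the sample values of $\theta$, $c$, $l$ are assumptions for illustration) iterates (19) and checks that $U^2 + V^2$ is invariant, so that $k$ iterations amount to a single rotation through the angle $k[\theta + c(U^2 + V^2)^l]$.

```python
import math

def twist_step(U, V, theta=0.7, c=0.3, l=1):
    """Apply the normal form (19) once: a rotation whose angle depends on U^2 + V^2.

    theta, c, l are hypothetical sample values.
    """
    angle = theta + c * (U * U + V * V) ** l
    return (U * math.cos(angle) - V * math.sin(angle),
            U * math.sin(angle) + V * math.cos(angle))

if __name__ == "__main__":
    U0, V0, k = 0.4, 0.1, 7
    U, V = U0, V0
    for _ in range(k):
        U, V = twist_step(U, V)
    print("U^2 + V^2 before:", U0**2 + V0**2, "after:", U**2 + V**2)
    # Single rotation through k times the (constant) angle on this circle.
    a = k * (0.7 + 0.3 * (U0**2 + V0**2))
    print("k-th iterate:", (U, V))
    print("one rotation:", (U0 * math.cos(a) - V0 * math.sin(a),
                            U0 * math.sin(a) + V0 * math.cos(a)))
```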


§ 24. Generality of normal form. Case II'. 

Likewise in analogy with § 22 for case I' we find: 

The most general normalizing variables $U^*$, $V^*$ in case II' have the explicit form

$$\begin{aligned}
U^* &= U\cos\lambda(U^2 + V^2) - V\sin\lambda(U^2 + V^2),\\
V^* &= U\sin\lambda(U^2 + V^2) + V\cos\lambda(U^2 + V^2),
\end{aligned} \tag{20}$$

where $\lambda$ is an arbitrary power series in $U^2 + V^2$ without constant term, and $U$, $V$ are
any particular set of normalizing series.


§ 25. The Integrable case. 

The formal series $u_k$, $v_k$ used in the preceding part of the paper may converge.
Suppose that these series converge uniformly for $|u|$, $|v|$, $|k|$ sufficiently
small. By the definition of $\delta u$, $\delta v$ as derivatives of $u_k$, $v_k$ respectively as to $k$
for $k = 0$, we see that in this case $\delta u$, $\delta v$ are given as convergent series. Consequently
the formal differential equations (9) are of the ordinary type with $\delta u$,
$\delta v$ analytic functions of $u$, $v$ vanishing for $u = v = 0$. It follows that $u_k$, $v_k$
converge uniformly for $|k| < K$, an arbitrary positive quantity, if $|u|$, $|v|$ are
sufficiently small. It is then that we speak of $u_k$, $v_k$ as convergent series.



A necessary and sufficient condition for the convergence of the series $u_k$, $v_k$
(as specified) is that the corresponding conservative transformation $T$ be integrable.

The fact that the integrability of $T$ is necessary is proved at once. In
the convergent case the formal differential equations are of the ordinary type
as noted above. Consequently the formally invariant function $F^*$ defined by
means of the equations (11) is an actual invariant function. That is, $T$ is
integrable.

To prove the sufficiency is not such an easy task. Let F be the given 
invariant analytic function. Every invariant series can be expressed as a power 
series in F or in fractional powers thereof (§ 14). In the latter case F' is an 
exact formal 7th power if the 7th root is to be extracted. And furthermore 
this root is of course also given by a convergent invariant series. Hence, without 
loss of real generality, we may assume that the invariant formal series $F^*$ is a
formal power series in $F'$, i.e. $F^* = \varphi(F')$.

Now write

$$Q(u'_k, v'_k)\,\frac{du'_k}{dk} = \frac{\partial F'(u'_k, v'_k)}{\partial v'_k}, \qquad Q(u'_k, v'_k)\,\frac{dv'_k}{dk} = -\frac{\partial F'(u'_k, v'_k)}{\partial u'_k},$$

where $Q$ is a quasi-invariant function belonging to the conservative transformation
$T$. The differential equations so defined, joined with the initial conditions
$u'_0 = u'$, $v'_0 = v'$, determine convergent power series $u'_k$, $v'_k$ which converge
uniformly for $|k| < K$ ($K$ arbitrary) if $|u'|$, $|v'|$ are sufficiently small. These
functions define a conservative, integrable transformation $T'$.

Furthermore, $T'$ will be of the same type I', II', II'' or III' as $T$, except
possibly that when $T$ is of type II'', $T'$ may be of types I', II' or III'. For
example, if $T$ is of type I' then $F^*$ has an initial term $uv\log\rho$ (see (8)) of the
second degree. Hence $F'$ begins with terms of at most the second degree. But
the initial terms cannot be of the first degree because of the relation $F^* = \varphi(F')$.
Hence we have $F' = cuv + \cdots$, and, by introducing a constant factor in $F'$, we may
take $c = \log\rho$. An inspection of the initial terms of the transformation $T'$ shows
then that $u'_1 = \rho u' + \cdots$, $v'_1 = \frac{1}{\rho}v' + \cdots$, as desired. An entirely similar argument
holds in the cases II', III'.

In all cases it is clear that either we can take the initial terms of $F^*$ to
coincide exactly with those of $F'$, or these terms are of higher degree in $F^*$
than in $F'$. In the first case $\frac{d\varphi}{dF'} = 1$ for $F' = 0$, while in the second case
$\frac{d\varphi}{dF'} = 0$ for $F' = 0$.

Consider the formal series

$$u'_{k'}(u', v'), \qquad v'_{k'}(u', v'),$$

where $k' = k\,\frac{d\varphi}{dF'}$. It is necessary to elaborate further what is meant.

Take case I' for example. Here $u'_k$, $v'_k$ are power series in $u'$, $v'$ with coefficients
polynomial in $\rho^k$, $\rho^{-k}$, $k$. Since

$$\frac{d\varphi}{dF'} = 1 + aF' + \cdots,$$

we have

$$\rho^{k'} = \rho^{k}(1 + aF'k\log\rho + \cdots).$$

That is, $\rho^{k'}$ can be written as $\rho^k$ multiplied by a power series in $u'$, $v'$ with
coefficients polynomial in $k$. A similar remark is true of $\rho^{-k'}$ and $k'$. When these
series are substituted in $u'_k(u', v')$, $v'_k(u', v')$, and the finite number of terms of
any particular degree in $u'$, $v'$ are collected, new power series in $u'$, $v'$ with
coefficients polynomial in $\rho^k$, $\rho^{-k}$, $k$ are formed. It is these series which we
designate by $u'_{k'}(u', v')$, $v'_{k'}(u', v')$.

Similarly in all of the other cases the new series $u'_{k'}$, $v'_{k'}$ are of the same
form as the series for $u_k$, $v_k$.

Now we have evidently

$$\frac{du'_{k'}}{dk} = \frac{\partial u'_{k'}}{\partial k'}\,\frac{d\varphi}{dF'}$$

by a rule of formal differentiation which evidently applies to each constituent
element of $u'_{k'}$, and thus to the entire series. A similar result holds for $v'_{k'}$.
Making use of these results, and also of the defining differential equations for
$u'_{k'}$, $v'_{k'}$ we find

$$Q\,\frac{du'_{k'}}{dk} = \frac{\partial F'}{\partial v'_{k'}}\,\frac{d\varphi}{dF'}, \qquad Q\,\frac{dv'_{k'}}{dk} = -\frac{\partial F'}{\partial u'_{k'}}\,\frac{d\varphi}{dF'},$$

where the arguments in $Q$, $F'$ are understood to be $u'_{k'}$, $v'_{k'}$. But, from the
relation $F^* = \varphi(F')$, it is clear that these differential equations for $u'_{k'}$, $v'_{k'}$ are
the same as those for $u_k$, $v_k$. Also these two pairs of functions reduce to $u'$, $v'$
and $u$, $v$ respectively for $k = 0$.

Since such formal differential equations and conditions determine a unique
power series in $u$, $v$ with coefficients functions of $k$ of the stated type, we obtain
the formal identities

$$u'_{k'}(u, v) = u_k(u, v), \qquad v'_{k'}(u, v) = v_k(u, v).$$

In particular, the above relation holds for $k = 1$ and gives

$$u'_{k'}(u, v) = u_1(u, v), \qquad v'_{k'}(u, v) = v_1(u, v),$$

where now $k' = \frac{d\varphi}{dF'}$.

The noteworthy feature of these equations is that the only possible divergent
element appearing is $k'$.

Now write k — + *"• Then «iV(u. t>), t/*(u, v) become convergent 

power series in u, v, k" for sufficiently small values of these variables. A formal 
power series in u, v without constant term satisfying the two equations above 

is V'— ' Since tl ,es © equations are of the ordinary analytic type, 

k" is a convergent power series. Consequently pP ' 8 a convergent series, 
and, since F is also, it follows that is a convergent power series in z. 


Finally then $\varphi$ is a convergent series.

It follows that $F^*$ is given by a convergent series in the integrable case
and thus, by the differential equations (9), that the series $u_k$, $v_k$ are convergent.

The simplicity of the integrable case is sufficiently evident from the following
fact:

In the integrable case explicit formulas for $u_k$, $v_k$ are at hand, namely

$$F^*(u_k, v_k) = F^*(u, v), \qquad k = \int_{u}^{u_k} \frac{Q\,du}{\partial F^*/\partial v} = -\int_{v}^{v_k} \frac{Q\,dv}{\partial F^*/\partial u}, \tag{21}$$

where the integrals are taken along the curve $F^* = \text{const.}$

The normal forms (16), (16') and (19), (19'), for cases I' and II' respectively,
are integrable. If these normal forms can be obtained by means of a change
of variables (15) in which the series $U$, $V$ are convergent, the given transformation
$T$ is integrable.

Conversely, suppose $T$ to be integrable and of type I'. The series $F^*$
converges and by (8) can be written $UV\log\rho$, where $U$, $V$ are convergent series
of the form (15). If we introduce these new variables, which we call $u$, $v$ for
brevity, then $uv$ is an invariant function.

For the integrable transformation $T$ in these variables, the convergent
series $\delta u$, $\delta v$ must be of the forms $up\log\rho$ and $-vp\log\rho$, where $p$ is a convergent
series with constant term $1$. In fact $v\,\delta u + u\,\delta v$ vanishes and the initial
terms of $\delta u$ and $\delta v$ are $u\log\rho$ and $-v\log\rho$ respectively.

If a further actual change of variables (15) can be made which gives $T$
the form

$$U_1 = \rho\,U e^{h(UV)}, \qquad V_1 = \frac{1}{\rho}\,V e^{-h(UV)},$$

then an additional actual change of variables as in § 21 yields the desired
normal form. But $U_1$, $V_1$ have this form if and only if

$$\delta U = U(\log\rho + h(UV)), \qquad \delta V = -V(\log\rho + h(UV)),$$

i.e. if

$$\left(u\frac{\partial U}{\partial u} - v\frac{\partial U}{\partial v}\right)p\log\rho = U(\log\rho + h(UV)),$$

$$\left(u\frac{\partial V}{\partial u} - v\frac{\partial V}{\partial v}\right)p\log\rho = -V(\log\rho + h(UV)).$$

We have then to find convergent series $U$, $V$, $h$ which satisfy this pair of equations,
in order to establish the proposition under consideration.

It is sufficient to satisfy the equations

$$\left(u\frac{\partial U}{\partial u} - v\frac{\partial U}{\partial v}\right)p\log\rho = U(\log\rho + \varphi(uv)),$$

$$\left(u\frac{\partial V}{\partial u} - v\frac{\partial V}{\partial v}\right)p\log\rho = -V(\log\rho + \varphi(uv)),$$

with convergent series $U$, $V$, $\varphi$, provided that $U$ and $V$ have initial terms $u$ and
$v$ respectively. For, multiplying the first equation by $V$, the second by $U$,
adding and integrating, we conclude that $UV$ is a function of the product $uv$
alone. Hence we have $UV = uv + \cdots$. Therefore $uv$ can be expressed inversely
as a power series in $UV$, and $\varphi(uv) = h(UV)$ where $h$ is convergent.

But, by the same equations, U and V contain no terms in v and u alone 
respectively, since p has a constant term 1. Consequently we may write 


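$$U = ue^{M}, \qquad V = ve^{N},$$

where $M$ and $N$ are convergent power series in $u$, $v$ without constant terms.
The equations above take the form

$$u\frac{\partial M}{\partial u} - v\frac{\partial M}{\partial v} = \frac{1}{p}\left(1 + \frac{\varphi(uv)}{\log\rho}\right) - 1,$$

$$u\frac{\partial N}{\partial u} - v\frac{\partial N}{\partial v} = 1 - \frac{1}{p}\left(1 + \frac{\varphi(uv)}{\log\rho}\right).$$

If convergent power series solutions $M$, $N$ and $\varphi$ without constant terms can
be found our proof will be complete.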

We observe in the first place that there are no terms in $u$, $v$ with equal
powers of $u$ and $v$ on the left. Hence, for any conceivable solution, the series
development of

$$\frac{1}{p}\left(1 + \frac{\varphi(uv)}{\log\rho}\right) - 1$$

contains no similar terms. But this property of a series is not modified if it is
multiplied by a series in $uv$ only but having a constant term. Hence

$$\frac{1}{p} - \frac{\log\rho}{\log\rho + \varphi(uv)}$$

is a similar series. The second term must consist precisely of those terms $p'(uv)$
in $uv$ alone found in $\frac{1}{p}$, and we have

$$\varphi(uv) = \log\rho\left[\frac{1}{p'(uv)} - 1\right].$$

Thus the only possible formal series $\varphi$ is convergent.

When this particular $\varphi$ is substituted, the right-hand members above become
power series in $u$, $v$ without terms having equal exponents. Write then

$$M = -N = \sum_{m+n=1}^{\infty} P_{mn}\,u^m v^n, \qquad (m \neq n).$$

The equations for the formal determination of the coefficients $P_{mn}$ show
that these are uniquely determined and not greater numerically than the corresponding
coefficients in the right-hand members. Thus the desired convergent
solution $M$, $N$, $\varphi$ is obtained.




An entirely similar discussion can be made in case II'.

In the integrable cases I', II', and then only, the normal forms (16), (16') and
(19), (19') can be obtained by a change of variables (15), where $U$, $V$ are convergent
series.


§ 26. The non-integrable case and the integrable case. 

Let two transformations $T$ and $T'$ be said to osculate to the $\mu$th order if
$u_1 - u'_1$ and $v_1 - v'_1$ are given by series beginning with terms of at least the
$(\mu + 1)$th degree; $T_k$ and $T'_k$ will also osculate to the $\mu$th order for any integral
value of $k$. It is clear that the formal series for $\delta u$, $\delta v$ and $\delta u'$, $\delta v'$ agree to
terms of the $(\mu + 1)$th degree. Conversely, if $\delta u$, $\delta v$ and $\delta u'$, $\delta v'$ agree out to
terms of the $(\mu + 1)$th degree, then $T$ and $T'$ osculate to the $\mu$th order.

Let $T$ be a given conservative transformation of types I', II', II'' or III'
with a quasi-invariant function $Q$ and a formally invariant series $F^*$. If $\bar Q$
and $\bar F^*$ are convergent series agreeing with $Q$ and $F^*$ to terms of the $(\mu + 1)$th and
$(\mu + 2)$th degrees respectively, then there will exist a corresponding integrable transformation
$\bar T$ with a quasi-invariant function $\bar Q$ and an invariant function $\bar F^*$, which
$T$ osculates to the $\mu$th order.

The transformation $\bar T$ is evidently that defined by the equations

$$\bar Q\,\frac{d\bar u_k}{dk} = \frac{\partial \bar F^*}{\partial \bar v_k}, \qquad \bar Q\,\frac{d\bar v_k}{dk} = -\frac{\partial \bar F^*}{\partial \bar u_k},$$

with initial conditions $\bar u_0 = u$, $\bar v_0 = v$.


Chapter II. Hyperbolic invariant points. 

§ 27. The analytic invariant curves in case I'.

In the non-integrable as well as in the integrable case I' the two real
formally given invariant curves correspond to actual curves. A proof of this
fact was first given by Poincaré (loc. cit.) and later by Hadamard.¹ Our
proof will be of a different character, and involves the hypothesis that $T$ is
conservative. A similar method will be used later by us in treating more general

¹ Sur l'itération et les solutions asymptotiques des équations différentielles. Bulletin de la Société Mathématique de France, vol. 29, 1901.


The two formally invariant curves in case I' may either be obtained from
the equation $F^* = 0$ or from the equations $U = 0$, $V = 0$ where $U$, $V$ are the
normalizing variables of § 21. In fact, when the transformation is in the normal
form, these equations yield the formally invariant curves.

In case I' the two real formally invariant curves give two analytic invariant
curves through the invariant point.

We commence our proof by choosing variables which osculate the normalizing
variables $U$, $V$ to the $\mu$th order ($\mu > 2$). According to § 21 we have then

$$u_1 = \rho\,u e^{cu^lv^l} + \omega(u, v), \qquad v_1 = \frac{1}{\rho}\,v e^{-cu^lv^l} + \eta(u, v),$$

where $\omega$, $\eta$ are convergent power series beginning with terms of the $(\mu + 1)$th
degree or of higher degree.

Our proof will consist of the following three steps:

(1) the limits $\lim_{k=\infty} u_k(\rho^{-k}t, 0)$, $\lim_{k=\infty} v_k(\rho^{-k}t, 0)$ exist as formal power series
$u^*(t)$, $v^*(t)$, and yield a formally invariant curve;

(2) $u_k(\rho^{-k}t, 0)$, $v_k(\rho^{-k}t, 0)$ are dominated by fixed convergent power series in $t$
for all $k$;

(3) and hence these series converge uniformly to limiting functions of $t$ for $|t|$
sufficiently small, namely to $u^*$, $v^*$ respectively, which are thus the coordinates
of the invariant curve $V = 0$.

A similar treatment of the invariant curve $U = 0$ can be based on the inverse
transformation $T_{-1}$.

Proof of (1). 

We have directly

$$u_k = \rho^{k}\,u e^{kcu^lv^l} + \omega(u, v, k), \qquad v_k = \rho^{-k}\,v e^{-kcu^lv^l} + \eta(u, v, k),$$

where $\omega(u, v, k)$, $\eta(u, v, k)$ are convergent power series beginning with terms of
the $(\mu + 1)$th or higher degree. Furthermore, the first term on the right-hand
side of either equation has evidently the property that when expanded in power
series in $u$, $v$ the coefficient of a term of the $m$th degree is linear in $\rho^k$ or $\rho^{-k}$
and polynomial in $k$ of degree less than the degree of the term. But $u_k$, $v_k$
have the property that the coefficient of $u^m v^n$ is a polynomial in $\rho^k$, $\rho^{-k}$, $k$ of
degree at most $m + n$ (§ 4). Therefore the same is true of $\omega(u, v, k)$, $\eta(u, v, k)$.
Thus we obtain at once

$$u_k(\rho^{-k}t, 0) = t + \sum_{n=\mu+1}^{\infty} p^{(n)}(\rho^{-k}, k)\,t^n, \qquad v_k(\rho^{-k}t, 0) = \sum_{n=\mu+1}^{\infty} q^{(n)}(\rho^{-k}, k)\,t^n,$$

where $p^{(n)}$, $q^{(n)}$ are polynomials in $\rho^{-k}$, $k$, and where every term involving $k$ is
affected with a multiplier $\rho^{-k}$ raised to a positive power.

It is thus seen that, inasmuch as $\rho > 1$, each coefficient of the series for
$u_k(\rho^{-k}t, 0)$ and $v_k(\rho^{-k}t, 0)$ approaches a limiting value as $k$ becomes infinite. In
other words there exist limiting formal series $u^*$, $v^*$. Clearly $u^*$ is given by a
series with first term $t$ and following terms of degree at least $\mu + 1$, while $v^*$
begins with a term of degree not less than $\mu + 1$.

Now we have the formal identities symbolized by

$$T\bigl(u_k(\rho^{-k}t, 0),\, v_k(\rho^{-k}t, 0)\bigr) = \bigl(u_{k+1}(\rho^{-(k+1)}t', 0),\, v_{k+1}(\rho^{-(k+1)}t', 0)\bigr),$$

where $t'$ stands for $\rho t$. By allowing $k$ to become infinite we obtain the identities
symbolized by

$$T\bigl(u^*(t), v^*(t)\bigr) = \bigl(u^*(\rho t), v^*(\rho t)\bigr).$$

This is precisely the condition that $u = u^*(t)$, $v = v^*(t)$ be a formally invariant
curve under $T$. The parameter $t$ on the curve goes over into $\rho t$.
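
The limiting process of step (1) can be watched on a toy example. The map below, $T(u, v) = (\rho u,\ v/\rho + u^2)$, is an assumption introduced purely for illustration: it has Jacobian determinant $1$ and a hyperbolic invariant point at the origin, but it is not written in variables osculating the normal form to order $\mu > 2$, so its invariant curve begins with a second degree term in $v$. For this map the limit of $T^k(\rho^{-k}t, 0)$ can be summed as a geometric series, giving the invariant curve $u = t$, $v = \rho t^2/(\rho^3 - 1)$, on which $T$ carries the parameter $t$ into $\rho t$.

```python
def T(u, v, rho=2.0):
    """Toy area-preserving map with a hyperbolic fixed point at (0, 0).

    This map is a hypothetical example, not one taken from the paper.
    """
    return rho * u, v / rho + u * u

def approx_curve_point(t, k, rho=2.0):
    """Compute T^k applied to (rho^-k * t, 0), the k-th approximation of step (1)."""
    u, v = t / rho**k, 0.0
    for _ in range(k):
        u, v = T(u, v, rho)
    return u, v

def exact_v(t, rho=2.0):
    """v-coordinate of the invariant curve of the toy map at parameter t."""
    return rho * t * t / (rho**3 - 1.0)

if __name__ == "__main__":
    t = 0.3
    for k in (1, 5, 10, 40):
        print(k, approx_curve_point(t, k))
    print("limit:", (t, exact_v(t)))
```

The successive approximations settle down geometrically, the error in $v$ being of order $\rho^{-3k}$ for this map.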

Proof of (2).

To establish (2) we begin by observing that, inasmuch as $u_k(u, v)$, $v_k(u, v)$
are computed by successive substitutions, every coefficient in these series will
certainly become positive, and as large numerically as it is originally, if $u_1$, $v_1$
are replaced by any series in $u$, $v$ with each coefficient positive or zero and as
large numerically as the like coefficient in $u_1$, $v_1$. Such series can be taken of
the form

$$\rho u + \frac{K(u + v)^2}{1 - L(u + v)}, \qquad \rho v + \frac{K(u + v)^2}{1 - L(u + v)},$$

provided that $K$ and $L$ are sufficiently large positive constants.¹

Hence when we compute $u_k(u, v)$, $v_k(u, v)$ for $u = \rho^{-k}t$, $v = 0$, taking $u_1$, $v_1$
to be these modified series, we obtain power series in $t$ which have positive
coefficients greater than originally and so dominate the earlier series. Again
these series will certainly be dominated by the sum

$$\bar u_k(\rho^{-k}t, 0) + \bar v_k(\rho^{-k}t, 0),$$

where $\bar u_1$, $\bar v_1$ are the dominating series exhibited above.


¹ The linear terms taken are clearly large enough. The coefficient of $u^m v^n$ ($m + n \geq 2$) in either series is at least as large as $KL^{m+n-2}$, which evidently exceeds numerically the coefficient of $u^m v^n$ in $u_1$ or $v_1$ if $K$, $L$ are chosen sufficiently large to begin with.


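The sum $\sigma_k = \bar u_k + \bar v_k$ obeys the law of formation

$$\sigma_{k+1} = \rho\,\sigma_k + \frac{2K\sigma_k^2}{1 - L\sigma_k},$$

with $\sigma_0 = \sigma = u + v$.

But the sequence $\sigma_k$ is itself obtained by the method of successive substitution.
Therefore the dominating series $\sigma_k$ is increased if we take

$$\sigma_{k+1} = \frac{\rho\,\sigma_k}{1 - M\sigma_k},$$

provided we take $M$ to be positive and as large as $\frac{2K}{\rho}$ and $L$.

Under these circumstances we get the following general formula for $\sigma_k$:

$$\sigma_k = \frac{\rho^k\sigma}{1 - M\sigma\,\dfrac{\rho^k - 1}{\rho - 1}}.$$

The corresponding series in $t$ is then

$$\frac{t}{1 - Mt\,\dfrac{1 - \rho^{-k}}{\rho - 1}},$$

which is dominated for all $k$ by the convergent series

$$\frac{t}{1 - \dfrac{Mt}{\rho - 1}}.$$

Hence this same series dominates the original series $u_k(\rho^{-k}t, 0)$, $v_k(\rho^{-k}t, 0)$ for all
positive integral values of $k$. Q. E. D.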
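
The general formula for the dominating sequence can be confirmed by direct iteration. The sketch below takes the law of formation to be $\sigma \to \rho\sigma/(1 - M\sigma)$, which is the reading of the garbled display adopted here and should be treated as an assumption; the function names and the sample values of $\rho$, $M$, $t$ are likewise illustrative. Exact rational arithmetic is used so that the closed form $\sigma_k = \rho^k\sigma / \bigl(1 - M\sigma(\rho^k - 1)/(\rho - 1)\bigr)$ can be compared with the iterate without rounding error.

```python
from fractions import Fraction

def majorant_step(s, rho, M):
    """One step of the assumed law of formation s -> rho*s / (1 - M*s)."""
    return rho * s / (1 - M * s)

def iterate_majorant(s, k, rho, M):
    """Apply the law of formation k times by successive substitution."""
    for _ in range(k):
        s = majorant_step(s, rho, M)
    return s

def closed_form(s, k, rho, M):
    """Closed form of the k-th iterate of the fractional linear map above."""
    return rho**k * s / (1 - M * s * (rho**k - 1) / (rho - 1))

if __name__ == "__main__":
    rho, M, t = Fraction(2), Fraction(3), Fraction(1, 10)  # hypothetical values
    bound = t / (1 - M * t / (rho - 1))  # the k-independent dominating value
    for k in range(1, 8):
        s0 = t / rho**k  # start at sigma = rho^-k * t, as in the proof
        print(k, iterate_majorant(s0, k, rho, M), "<", bound)
```

Since $1/\sigma$ transforms linearly under the map, the iteration reduces to summing a geometric progression, which is where the factor $(\rho^k - 1)/(\rho - 1)$ comes from.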

Proof of (3). 

With the aid of (1) and (2), established above, we can at once show that
the power series $u_k(\rho^{-k}t, 0)$, $v_k(\rho^{-k}t, 0)$ must be approaching a limiting pair of
functions uniformly for $|t|$ sufficiently small.

To this end we choose $k$ so large that all of the coefficients up to the $m$th
in both series ($m$ arbitrary) differ by less than an arbitrarily assigned positive $\epsilon$
from their limiting values, which exist by our result (1). The sum of these $m$
terms never varies for greater $k$ by more than a fixed $\epsilon_1$ if $|t|$ be restricted.



But the remaining terms cannot exceed the sum of the corresponding terms
of the fixed dominating series with $t$ replaced by $|t|$. Hence the sum of these
terms is arbitrarily small if $m$ is sufficiently large.

The fact of uniform convergence is evident.

Thus $u^*(t)$, $v^*(t)$ are not only formal series but these series converge to
actual analytic functions which we denote by $u^*(t)$, $v^*(t)$. Consequently we have
an analytic invariant curve $u = u^*(t)$, $v = v^*(t)$. Since $u^*(t)$ begins with a term
$t$ while $v^*(t)$ begins with a term of degree at least $\mu + 1$, this invariant curve
has contact of order $\mu$ at least with the $u$-axis at the invariant point, and
corresponds to the invariant curve $V = 0$. The parameter change under $T$
along the curve is $t_1 = \rho t$.

Evidently the existence of these two analytic invariant curves $U = 0$, $V = 0$
establishes the fact that the invariant point is unstable in case I'.


§ 28. A general property in case I'. 

Introduce new variables of the type (15) 

$$U = u - \varphi(v), \qquad V = v - \psi(u),$$

where $u = \varphi(v)$ and $v = \psi(u)$ are the analytic invariant curves of the preceding 
section. In the UV-plane the invariant curves are the axes. For the sake of 
brevity of notation we will let u, v denote any set of variables which make the 
axes and the invariant curves coincide. 

When such variables have been selected it is clear that $u_1 = 0$ if $u = 0$, 
and that $v_1 = 0$ if $v = 0$. Hence we have 

$$u_1 = u(\rho + \cdots), \qquad v_1 = v\left(\frac{1}{\rho} + \cdots\right).$$

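From this form of T we infer at once 

$$\rho - \varepsilon < \left|\frac{u_1}{u}\right| < \rho + \varepsilon, \qquad \frac{1}{\rho} - \varepsilon < \left|\frac{v_1}{v}\right| < \frac{1}{\rho} + \varepsilon$$

for points near (0, 0). Here $\varepsilon$ is an arbitrarily small positive quantity. Thus 
u increases numerically and v decreases numerically upon iteration of T, in 
such wise that the following result is obvious. 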

If the invariant curves are taken as the axes in case I' by means of a preliminary 
choice of variables (15), every point of the region $u^2 + v^2 < \delta^2$, $u \neq 0$, is 
carried out of the region by iteration of T, while every point $v \neq 0$ is carried out of 
the region by iteration of $T_{-1}$. The excluded points of the axes approach (0, 0) 
under the same conditions. 

These considerations show that there can exist no further invariant curves 
through (o, o) besides the two analytic curves above obtained. 
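
The behaviour described in the italicized statement of § 28 can be observed numerically. The sketch below is only an illustration under stated assumptions: it uses a model area-preserving map with the axes as invariant curves, u_1 = rho u exp(c u v), v_1 = v exp(-c u v)/rho, and the values RHO, C and delta are hypothetical choices, not taken from the text.

```python
import math

RHO = 2.0  # hypothetical multiplier rho > 1 (assumption for this sketch)
C = 0.5    # hypothetical coefficient of the model map (assumption)


def T(u, v):
    """One step of the model map u1 = rho*u*e, v1 = v/(rho*e), e = exp(C*u*v)."""
    e = math.exp(C * u * v)
    return RHO * u * e, v / (RHO * e)


def T_inv(u, v):
    """Inverse step; the product u*v is unchanged by T, so the same factor e applies."""
    e = math.exp(C * u * v)
    return u / (RHO * e), RHO * v * e


def steps_to_leave(step, u, v, delta=0.1, max_steps=10_000):
    """Return the number of iterations of `step` after which u^2 + v^2 >= delta^2,
    or None if the point is still inside the disc after max_steps iterations."""
    for k in range(max_steps + 1):
        if u * u + v * v >= delta * delta:
            return k
        u, v = step(u, v)
    return None
```

In this model a point with u different from 0 leaves the disc under T, a point with v different from 0 leaves it under T_inv, and the excluded points of the axes stay inside and tend to (0, 0).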


§ 29. On the invariant series in case I'. 

The treatment of invariant points in case I' as given above is sufficient for 
the later parts of the paper. Nevertheless, there remains open the question of 
the actual existence of divergent formal series F* in case I'. Unfortunately I 
have not been able to answer this question. In the present paragraph upper 
limits for the coefficients of a particular invariant formal series are obtained. 

If the invariant curves are taken as the axes in case I' by means of a preliminary 
choice of variables (15) and if F denotes the invariant formal series having no 
terms with equal exponents in u, v save $uv \log\rho$, then the coefficient of $u^m v^n$ in F 
does not numerically exceed <W". where C mn >o is the coefficient of t*"v» in a 
convergent power series. 

Before entering upon the proof of this statement, we note that in a series 
F with coefficients so restricted the terms in any power of u or of v form a convergent 
series. 

There exists such an invariant series F, for, by forming $\varphi(F^*)$ as a power 
series in F* beginning with a term F*, we can eliminate the terms with equal 
exponents in F. Since the most general invariant series is a power series in F* 
in case I', it follows that this particular series F is uniquely determined by the 
given condition. 

In order to effect a proof of the italicized statement we first write the 
equation $F(u_1, v_1) = F(u, v)$ in the form 

$$F\bigl(u_1(u/\rho, \rho v),\; v_1(u/\rho, \rho v)\bigr) = F(u/\rho, \rho v)$$

obtained by replacing u, v by $u/\rho$, $\rho v$ respectively. We have 

$$u_1(u/\rho, \rho v) = up, \qquad v_1(u/\rho, \rho v) = vq,$$




where p and q are convergent power series in u and v with constant term 1. 
The equation above may be written 

$$F(u/\rho, \rho v) = F(up, vq).$$

Likewise from the equation $F(u_{-1}, v_{-1}) = F(u, v)$ we obtain an equation 

$$F(\rho u, v/\rho) = F(ur, vs),$$

where r and s are convergent power series in u, v with constant term 1. 

If $F_{mn}$ denotes the coefficient of $u^m v^n$ in F so that $F_{11} = \log\rho$, $F_{10} = F_{01} = 0$, 
there results, by a comparison of coefficients in these two equations, 

$$F_{mn}(\rho^{n-m} - 1) = P_{mn}, \qquad F_{mn}(\rho^{m-n} - 1) = Q_{mn},$$

where $P_{mn}$, $Q_{mn}$ are linear homogeneous expressions in $F_{\alpha\beta}$ with $\alpha \le m$, $\beta \le n$, 
$\alpha + \beta < m + n$. The coefficients of $F_{\alpha\beta}$ in $P_{mn}$ are polynomials in the coefficients 
of the series p, q with positive integral coefficients, while the coefficients of $F_{\alpha\beta}$ 
in $Q_{mn}$ are similar polynomials in the coefficients of the series r, s. Combining 
the above equations we obtain 

$$F_{mn}(\rho^{m-n} + \rho^{n-m} - 2) = P_{mn} + Q_{mn}.$$

For $m \neq n$ the coefficient of $F_{mn}$ is positive. 
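
The positivity of the combined divisor can be checked in one line. The identity below is a supplementary remark rather than part of the original argument; it assumes only that $\rho$ is real, positive and different from 1.

```latex
\rho^{m-n} + \rho^{n-m} - 2
  = \left(\rho^{\frac{m-n}{2}} - \rho^{-\frac{m-n}{2}}\right)^{2} > 0
  \qquad (m \neq n,\ \rho > 0,\ \rho \neq 1).
```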

Suppose that p, q, r, s are replaced by a single dominating series, say 

$$\frac{1}{1 - A(u + v)}.$$

Then $P_{mn}$, $Q_{mn}$ take a common form and the modified equations 

$$F_{mn}(\rho^{m-n} + \rho^{n-m} - 2) = 2P_{mn}, \qquad (m \neq n),$$

define new positive quantities $F_{mn}$ for $m \neq n$, at least as large as before in 
absolute magnitude. 

Along with these equations we consider the equations 

— x) - (m + 3), 




in which the arguments F in $P_{mn}$ are replaced by G. These equations determine 
$G_{mn}$ for m + n = 3, m + n = 4, ... in succession, provided that we take $G_{11} = \log\rho$, 
$G_{10} = G_{01} = 0$. These differ from the equations determining the modified values 
$F_{mn}$ only in that the divisors $\rho^{m-n} + \rho^{n-m} - 2$ are replaced by the larger 
divisors $\rho^{m+n} - 1$. Consequently we have 

$$F_{mn} \le G_{mn} \prod \frac{\rho^{m'+n'} - 1}{\rho^{m'-n'} + \rho^{n'-m'} - 2},$$

where the values m', n' written are for all the divisors explicitly entering into 
some one term of the complete expression for $F_{mn}$. 

Now take these divisors in order beginning with m' = m, n' = n. The next 
divisor has $m' \le m$, $n' \le n$, $m' + n' < m + n$, and in general m', n' do not increase 
while m' + n' decreases by at least unity at each stage. 

For m' > n' we have 


r t:’- 1 _ 


and there is a symmetrical inequality which holds for n' > m'. Let us replace 
the factors above by these larger factors. If m > n a superior limit for the 
product $\prod$ is therefore obtained by making m' diminish by unity successively 
and keeping n' = n until we have m' = n + 1, and thereafter decreasing n' and m' 
alternately by 1. Hence the product of the factors is less than 


and thus less than 


Thus we have 










2»*V "- 

(-r~‘ 


1 f '-> «■« 


for m > n, with the same inequality holding for n > m. 




It is clear therefore that $|F_{mn}|$ is restricted by an inequality of the type 
stated if the series 

$$G = \sum G_{mn} u^m v^n$$

converges. 

The coefficients $G_{mn}$ and $G_{nm}$ are equal so that $G_{mn} + P_{mn}$ is the coefficient 
of $u^m v^n$ in either 

G (i — A{u + vj’ i — A(u + v)) or G [i — A [u + v) ’ x — A(u + vj)’ 

Moreover, it follows also from the equations of definition of $G_{mn}$ that the 
difference 

0( e u. tv) - - [o( x _ A („ + v) . J _ Alu + + o [~_ ^, u + A (., + V))]' 

considered as a formal series, has no terms in $u^m v^n$ for $m + n \ge 3$, and so 
reduces to 

$$(\rho^2 - 1)\, uv \log\rho.$$

Furthermore, G is determined formally by this property. 

If we replace this last difference by 

i(f*-1) (« + »)*. 


which dominates it, a modified G series is obtained, satisfying the equation 
OU/u.gv)- ? [o(—. 1 _ 1(1( + „,) + 0(— ^ + + 


+ - (e s — i)(w h *)*, 


and certainly dominating the former G series. This functional equation wholly 
determines the new series. 

But the functional equation 


'(.s, = 




admits of an analytic solution, namely 



■2 


*« 




It follows that $G(u, v) = f(u + v)$ gives the solution of the modified equation 
for G, and thus that the original G series converges. Consequently the proposition 
under consideration is fully established. 


§ 30. The case I". 

This case is easily disposed of inasmuch as $T_2$ is of the type I' treated 
above. 

If we choose formal normalizing variables, $T_2$ becomes precisely 

$$u_2 = \rho^2 u\, e^{c u^l v^l}, \qquad v_2 = \frac{1}{\rho^2}\, v\, e^{-c u^l v^l}.$$

This is possible by § 21. 

We have $u_2 v_2 = uv$. In case $u_1 v_1 \neq uv$, write 

$$u_1 v_1 = uv + \psi(u, v) + \cdots$$

where $\psi$ is a homogeneous polynomial in u, v of at least the third degree. 
This gives 

$$u_2 v_2 = u_1 v_1 + \psi(u_1, v_1) + \cdots = uv + \psi(u, v) + \psi(\rho u, v/\rho) + \cdots.$$

Hence $\psi(u, v) + \psi(\rho u, v/\rho)$ vanishes identically. This is impossible for any 
polynomial not identically 0 since $\rho \neq -1$. Thus we conclude that $u_1 v_1 = uv$, and 
accordingly that $u_1$ and $v_1$ are divisible by u and v respectively. 

We may now write 

$$u_1 = \rho u\, g(u, v), \qquad v_1 = \frac{v}{\rho\, g(u, v)}.$$

Here g(u, v) is a power series in u, v with initial term 1. 




The first of these equations gives $u_2 = \rho^2 u\, g(u, v)\, g(u_1, v_1)$ whence, by 
comparison with the normal form, 

$$g(u, v)\, g(u_1, v_1) = e^{c u^l v^l}.$$

Replacing u, v by $u_1$, $v_1$ in this equation, we have 

$$g(u_1, v_1)\, g(u_2, v_2) = e^{c u^l v^l},$$

so that, by a comparison, $g(u_2, v_2) = g(u, v)$, i. e. g(u, v) is an invariant function 
under $T_2$, and must be a function of the product uv only (§ 14), namely $e^{\frac{c}{2} u^l v^l}$. 
The form of the transformation T is now fully determined. 

In the case I" by the aid of a formal change of variables (15) the transformation 
T may be reduced to the form 

$$u_1 = \rho u\, e^{\frac{c}{2} u^l v^l}, \qquad v_1 = \frac{1}{\rho}\, v\, e^{-\frac{c}{2} u^l v^l},$$

where we may have c = 0, l = ∞ as in case I'. 

This same reduction shows that the formally invariant curves under T and 
$T_2$ coincide. 

In case I" there is a formally invariant function F* and two analytic invariant 
curves through the invariant point, these being the same as for $T_2$. 

We can at once infer that the same property holds in case I" as is given in 
case I' by the italicized statement of § 28. Hence there are no further invariant 
curves through (0, 0). 

In the integrable case these normal forms can be obtained by means of 
ordinary changes of variables (§ 25). 


§ 31. An example in the hyperbolic case II". 

There are of course no real invariant formal curves in case II' inasmuch 

as F» is of the form - ' «(„• + »*) by (, 3 ). Thus the case I' treated above may 

be regarded as the general hyperbolic case, while II' is the general elliptic case. 
The cases II", III' may be either hyperbolic or elliptic. 

In the hyperbolic case II" we can set up an example showing that the 
formal series F* may diverge and also illustrating other significant features. 



The transformation T is the following 

$$u_1 = \frac{u}{1 + u}, \qquad v_1 = (1 + u)^2 (v + u^2).$$

This is evidently of type II" at the invariant point (0, 0). Moreover, since the 
Jacobian of $u_1$, $v_1$ as to u, v is 1, areas are preserved, and T is conservative 
with quasi-invariant function Q = 1. 

By direct iteration we find 

$$u_k = \frac{u}{1 + ku}, \qquad v_k = (1 + ku)^2 \left[ v + u^2 \left( 1 + \frac{1}{(1+u)^4} + \cdots + \frac{1}{(1 + (k-1)u)^4} \right) \right]$$

for all positive integral values of k. The expression for $u_k$ is of the type given in § 5. 
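
The iteration formulas can be checked with exact rational arithmetic. The sketch below is an illustration under an assumption: it reads the transformation as u_1 = u/(1+u), v_1 = (1+u)^2 (v + u^2), a reading consistent with the unit Jacobian stated in the text and with the later formulas in the third derivative of psi. The function names are hypothetical.

```python
from fractions import Fraction


def T(u, v):
    """Assumed reading of the example: u1 = u/(1+u), v1 = (1+u)^2 (v + u^2)."""
    return u / (1 + u), (1 + u) ** 2 * (v + u ** 2)


def iterate(u, v, k):
    """Apply T exactly k times (k >= 0)."""
    for _ in range(k):
        u, v = T(u, v)
    return u, v


def closed_form(u, v, k):
    """u_k = u/(1+k u), v_k = (1+k u)^2 [v + u^2 * sum_{j<k} (1+j u)^(-4)], k >= 0."""
    s = sum(Fraction(1) / (1 + j * u) ** 4 for j in range(k))
    return u / (1 + k * u), (1 + k * u) ** 2 * (v + u ** 2 * s)
```

With `Fraction` inputs both functions return exact rationals, so the comparison for small k is an equality test rather than a floating-point estimate.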

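To express $v_k$ in such a form we introduce the well-known function $\psi(z) = \frac{d}{dz} \log \Gamma(z)$.¹ 

We have, by means of the functional equation for $\Gamma(z)$, 

$$\psi(z + 1) = \frac{1}{z} + \psi(z),$$

so that 

$$\psi'''(z + 1) = -\frac{6}{z^4} + \psi'''(z),$$

where $\psi'''$ stands for the third derivative of $\psi$ as to z. Hence we have 

$$1 = \frac{1}{6u^4}\left[\psi'''\!\left(\tfrac{1}{u}\right) - \psi'''\!\left(\tfrac{1}{u} + 1\right)\right], \qquad \frac{1}{(1+u)^4} = \frac{1}{6u^4}\left[\psi'''\!\left(\tfrac{1}{u} + 1\right) - \psi'''\!\left(\tfrac{1}{u} + 2\right)\right], \quad \ldots,$$

whence, by addition, 

$$1 + \frac{1}{(1+u)^4} + \cdots + \frac{1}{(1 + (k-1)u)^4} = \frac{1}{6u^4}\left[\psi'''\!\left(\tfrac{1}{u}\right) - \psi'''\!\left(\tfrac{1}{u} + k\right)\right].$$

Thus we may write for positive integral values of k, and likewise for k a 
negative integer or 0, 

$$v_k = (1 + ku)^2 \left[ v + \frac{1}{6u^2}\left\{\psi'''\!\left(\tfrac{1}{u}\right) - \psi'''\!\left(\tfrac{1}{u} + k\right)\right\} \right].$$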

1 For a simple development of the properties of $\psi(z)$ used here, see K. P. Williams, The 
asymptotic form of the function $\Psi(x)$, Bulletin of the American Mathematical Society, vol. 19, 1912. 




Now $\psi(z)$ is given asymptotically by the series 

$$\log z - \frac{1}{2z} + \sum_{n=1}^{\infty} \frac{(-1)^n B_n}{2n\, z^{2n}}$$

in the right half of the complex z-plane, in which $B_n$ denotes the nth Bernoulli 
number. By differentiating three times we infer that $\psi'''(z)$ is given by an 
asymptotic series, 

$$\frac{2}{z^3} + \frac{3}{z^4} - \sum_{n=1}^{\infty} \frac{(-1)^n (2n+1)(2n+2) B_n}{z^{2n+3}},$$

in the right half of the complex z-plane. This series satisfies formally the 
functional equation for $\psi'''$ given above, although the series diverges of course. 
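
The asymptotic character of this series can be seen numerically. The sketch below is illustrative only: it evaluates the third derivative of psi from the standard series 6 * sum over n >= 0 of (z+n)^(-4) for real z > 0 (a partial sum plus the first two Euler-Maclaurin tail terms) and compares it with a few terms of the asymptotic series. The list B_OLD holds Bernoulli numbers in the older convention B_1 = 1/6, B_2 = 1/30, ..., which is assumed to be the convention of the text; all function names are hypothetical.

```python
from fractions import Fraction

# Bernoulli numbers in the older convention (assumed to match the text):
# B_1 = 1/6, B_2 = 1/30, B_3 = 1/42, B_4 = 1/30.
B_OLD = [Fraction(1, 6), Fraction(1, 30), Fraction(1, 42), Fraction(1, 30)]


def psi3_direct(z, n_terms=2000):
    """psi'''(z) = 6 * sum_{n>=0} (z+n)^(-4) for real z > 0.

    The tail beyond n_terms is approximated by the first two
    Euler-Maclaurin terms, 1/(3 t^3) + 1/(2 t^4) with t = z + n_terms.
    """
    head = sum((z + n) ** -4 for n in range(n_terms))
    t = z + n_terms
    tail = 1.0 / (3.0 * t ** 3) + 1.0 / (2.0 * t ** 4)
    return 6.0 * (head + tail)


def psi3_asymptotic(z, n_max):
    """Partial sum 2/z^3 + 3/z^4 - sum_{n=1}^{n_max} (-1)^n (2n+1)(2n+2) B_n / z^(2n+3)."""
    total = 2.0 / z ** 3 + 3.0 / z ** 4
    for n in range(1, n_max + 1):
        coeff = (2 * n + 1) * (2 * n + 2) * float(B_OLD[n - 1])
        total -= (-1) ** n * coeff / z ** (2 * n + 3)
    return total
```

At z = 10 a handful of terms already agrees with the direct sum to many digits, even though the full series diverges for every z.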

If we replace $\psi'''$ by this divergent series in the expression for $v_k$ found above, 
$v_k$ becomes a power series in u, v with coefficients polynomial in k. Moreover, 
for k = 1, 2, 3, ... the series involved must converge. In fact $v_k$ is then a function 
analytic at u = 0, and therefore its asymptotic series is its power series expansion. 
Hence we have here the unique formal series for $v_k$ of the type considered 
in § 5. 

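By direct formal differentiation of these series for $u_k$, $v_k$ and setting k = 0, 
we obtain 

$$\delta u = -u^2, \qquad \delta v = 2uv - \frac{1}{6u^2}\, \psi^{IV}\!\left(\frac{1}{u}\right).$$

Moreover, since Q = 1, we have for the invariant series F* by formula (11) 

$$\frac{\partial F^*}{\partial v} = -u^2, \qquad \frac{\partial F^*}{\partial u} = -2uv + \frac{1}{6u^2}\, \psi^{IV}\!\left(\frac{1}{u}\right),$$

whence we get immediately 

$$F^* = -u^2 v - \frac{1}{6}\, \psi'''\!\left(\frac{1}{u}\right).$$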

This of course is a divergent formal power series in u, v. But it was seen 
in § 25 that the series for F* converges in the integrable case. 




Thus T is of non-integrable type. 

We note the very significant fact that if $\psi'''$ be regarded as a function and 
not as a formal series, we have here an actual invariant function, real and 
analytic for u > 0, and asymptotically given by the formal series for F* for u 
small, so that this function is continuous together with all of its derivatives 
for u ≥ 0. 
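
The invariance of this function for u > 0 can be tested numerically. The sketch below is an illustration under assumptions: it reads the transformation as u_1 = u/(1+u), v_1 = (1+u)^2 (v + u^2) and takes the candidate invariant to be F(u, v) = -u^2 v - psi'''(1/u)/6, whose overall sign follows one reading of the source and does not affect invariance. The third derivative of psi is computed from the standard series 6 * sum over n >= 0 of (z+n)^(-4); all names are hypothetical.

```python
def psi3(z, n_terms=2000):
    """psi'''(z) = 6 * sum_{n>=0} (z+n)^(-4) for real z > 0,
    using a partial sum plus the Euler-Maclaurin tail 1/(3 t^3) + 1/(2 t^4)."""
    head = sum((z + n) ** -4 for n in range(n_terms))
    t = z + n_terms
    return 6.0 * (head + 1.0 / (3.0 * t ** 3) + 1.0 / (2.0 * t ** 4))


def T(u, v):
    """Assumed reading of the example transformation."""
    return u / (1 + u), (1 + u) ** 2 * (v + u ** 2)


def F(u, v):
    """Candidate invariant function for u > 0 (sign convention assumed)."""
    return -u * u * v - psi3(1.0 / u) / 6.0
```

The check F(T(u, v)) = F(u, v) rests only on the functional equation psi'''(z+1) = psi'''(z) - 6/z^4, so it holds for the function itself and not merely for its formal series.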

At first sight this seems to leave available no similar function for u < 0. 
Such a function is readily furnished as follows. The function $\psi(1 - z)$ satisfies 
the same functional equation as $\psi(z)$, is analytic in the left half of the complex 
z-plane, and is given asymptotically by the same divergent series as $\psi$. Hence 
a similar invariant function F* can be obtained by replacing the function $\psi'''(z)$ 
by the third derivative of $\psi(1 - z)$. 

If the invariant functions for the two halves of the uv-plane are united we 
obtain a real invariant function, analytic save for u = 0, continuous together 
with its partial derivatives, and given asymptotically by F* at (0, 0). 

It is this general type of invariant function which probably exists in all 
cases. 

In the case I', I can prove the existence of a real invariant function, 
continuous with all of its derivatives and asymptotically given by F* at (0, 0). 
But this discussion is omitted since the existence of an analytic invariant function 
is highly probable in the general hyperbolic case. 

We come now to the question of formally invariant curves. These are 
obtained by factorization of F* (§ 19). Since $\psi'''(1/u)$ has an initial term $2u^3$, the 
curves are at once seen to be the curve u = 0 taken doubly, and the formal 
curve $v = -\frac{1}{6u^2}\, \psi'''(1/u)$. The latter curve also gives an invariant curve, analytic 
for $u \neq 0$ such that v is continuous together with its derivatives of all orders in 
u at u = 0. Half of this curve (u > 0) is that arising when by $\psi'''$ we understand 
the function $\psi'''(z)$ introduced above. For u < 0 we get of course the other 
half of the invariant curve from the function $\psi(1 - z)$. 

The analytic invariant curve $u^2 = 0$ corresponds to a multiple factor of F*, 
and accordingly is an invariant point curve (§ 20). This is readily verified from 
the explicit formulas for $u_1$, $v_1$. 

It is also worthy of note that, for any point (u, v) not on either invariant 
curve and with u > 0, $u_k$ increases for k = -1, -2, .... Also for k = 1, 2, ..., 
we note that 




$$F^* = -u_k^2 \left[ v_k + \frac{1}{6u_k^2}\, \psi'''\!\left(\frac{1}{u_k}\right) \right] = \text{const.} \neq 0.$$

Hence as $u_k$ diminishes the expression in brackets increases. In other words 
any such point (u, v) leaves the vicinity of (0, 0), both upon indefinite iteration 
of $T_{-1}$ and of T. A similar argument shows that the same is true for a point 
(u, v) with u < 0. 


§ 32. Preliminary normalization in the hyperbolic case II". 

We will consider first what may be termed the non-specialized case II". 
The characteristic feature of the case II" is that the invariant formal series for 
F* begins with terms of higher than the second degree. In general then we have 

$$F^* = a_{30}u^3 + a_{21}u^2 v + a_{12}uv^2 + a_{03}v^3 + \cdots,$$

where the roots of the cubic in $\sigma$, 

$$a_{30} + a_{21}\sigma + a_{12}\sigma^2 + a_{03}\sigma^3 = 0,$$

are distinct and at least one of these roots is real. To each such real root $\sigma$ 
corresponds a real formally invariant curve $v = \sigma u + \cdots$. In general therefore 
case II" is of hyperbolic type. We consider any such real formally invariant 
curve C. 

By a linear change of variables C can be taken tangent to a new u-axis. 
That is, we can make $\sigma = 0$ in this way. The equation of C is now of the form 
$v = \varphi(u)$. By a further formal change of variables (15), 
$U = u$, $V = v - \varphi(u)$, the formal curve C can be taken into the U-axis. It is to 
be observed that any such linear change of variables as well as a change of 
variables (15) leaves T of the same type II". Since V = 0 is the equation of the 
invariant curve, both $V_1$ and F* are divisible by V. 

If we do not make the above formal change of variables but make an 
actual change to variables which arc the same to arbitrarily high degree „ + , 

bv 17 ~ , t0 t ? r “ S ° f degreC " + *•«“> “f rfogree and thus, 

by (.1), F ,s unaltered to terms of degree ,, + Consequently we may write 

$$v_1 = v[1 + \cdots] + \omega(u, v),$$

$$F^*(u, v) = v[a_{21}u^2 + a_{12}uv + a_{03}v^2 + \cdots] + \eta(u, v),$$




where the brackets are polynomials in u, v of degree at most $\mu - 1$ and where 
$\omega$, $\eta$ are power series with initial terms of degree at least $\mu + 1$. 

Under our hypotheses $a_{21}$ is not 0, for in that case $\sigma = 0$ would be a double 
root of the equation in $\sigma$. 

By direct iteration of the formal series in their original form (§ 2) we find 

$$u_k = u + k(\varphi_{20}u^2 + \varphi_{11}uv + \varphi_{02}v^2) + \cdots,$$

$$v_k = v + k(\psi_{20}u^2 + \psi_{11}uv + \psi_{02}v^2) + \cdots.$$

Hence we find, on differentiating as to k and taking k = 0, 

$$\delta u = \varphi_{20}u^2 + \varphi_{11}uv + \varphi_{02}v^2 + \cdots,$$
$$\delta v = \psi_{20}u^2 + \psi_{11}uv + \psi_{02}v^2 + \cdots.$$

Using the equations (11) and bearing in mind the fact that Q commences with 
a constant term 1, we see that 

$$\varphi_{20} = a_{21}, \qquad \varphi_{11} = 2a_{12}, \qquad \varphi_{02} = 3a_{03},$$

$$\psi_{20} = 0, \qquad \psi_{11} = -2a_{21}, \qquad \psi_{02} = -a_{12}.$$

These results may be summarized as follows: 

Let T be a conservative transformation of type II" for which there are three 
formally invariant curves with ordinary points and distinct tangents at (0, 0). If 
variables u, v are properly chosen, any real formal curve of this sort can be made 
to osculate the u-axis to any order $\mu > 2$. Under these circumstances we have 

$$u_1 = u + a_{21}u^2 + 2a_{12}uv + \cdots, \qquad (a_{21} \neq 0),$$

$$v_1 = v[1 - 2a_{21}u - a_{12}v + \cdots] + \omega(u, v),$$
$$F^*(u, v) = v[a_{21}u^2 + a_{12}uv + a_{03}v^2 + \cdots] + \eta(u, v),$$

where the bracketed expressions are polynomials of degree at most $\mu - 1$ and $\omega$, $\eta$ 
are power series with initial terms of degree at least $\mu + 1$. 

§ 33. Some inequalities in the hyperbolic case II". 

Let us take variables u, v as above. We will take $\mu > 5$ and also $a_{21} > 0$. 
Furthermore u, v are taken to be complex, of small moduli, and such that 



(22) $\Re(u) > 0$, $|v| < |u|^3$, 

where $\Re(u)$ designates the 'real part of u'. 

The series for $u_1$ gives us at once 

(23) $|u_1 - u - a_{21}u^2| < E^{(1)} |u|^3$, 

where $E^{(1)}$ is a definite positive constant. If we introduce a new variable 
$z = 1/u$, (23) can be given the essentially equivalent but more convenient form 

(23') $|z_1 - z + a_{21}| < E^{(2)} |z|^{-1}$. 
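
The passage from (23) to (23') can be illustrated on a truncated model. The sketch below is not the general series of the text: it assumes v = 0 and keeps only the terms u_1 = u + a u^2, with a standing for the coefficient a_21. For this model one finds exactly z_1 - z + a = a^2/(z + a), so the remainder is smaller than a^2/|z| whenever a > 0 and the real part of z is positive. The function names are hypothetical.

```python
def step_u(u, a):
    """Truncated model of the first series on v = 0: u1 = u + a*u^2 (assumption)."""
    return u + a * u * u


def remainder(z, a):
    """Return z1 - z + a, where z = 1/u and z1 = 1/u1 for the truncated model."""
    u = 1 / z
    z1 = 1 / step_u(u, a)
    return z1 - z + a
```

Complex values of z are accepted, which matches the complex domain used in this section; the real part of z then decreases by roughly a at each step while the imaginary part changes slowly.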

Suppose now that we take v = 0, $\Re(z) > R$, R being a large positive quantity; in 
this case the inequalities (22) are satisfied. By iteration of T we obtain $(z_1, v_1)$, 
$(z_2, v_2)$, .... Let us assume for the present that $\Re(z_l) > R$, $|v_l| < |u_l|^3$ for 
l = 0, 1, ..., n - 1 with n > 0. 

From the inequalities (23') for z, $z_1$, $z_2$, ..., we infer 

$$|z_l - z_{l-1} + a_{21}| < E^{(2)} R^{-1}, \qquad (l = 1, 2, \ldots, n).$$

These inequalities show that the real part of $z_l$ diminishes by approximately $a_{21}$ 
as l increases by 1, while the imaginary component varies slowly. By combination 
we obtain 

\zi-zj + | (<> <;</). 

and thence 

l*>l>l*/ + (/ — i)a u \ — (/ — 

But, since z, has a positive real part, we have 

In + (f i)a„ [> (f — j)a„. 

whence 


'-?• < a Mn+ a„|. 

Replacing l - j > 0 by this greater value in the negative term of the inequality 
for $|z_j|$ we find 


«*-*•*■■ 






The polynomial of degree $\mu$, $F = F^* - \eta$, has the same terms as those of 
the formal series F* out to terms of degree $\mu + 1$. Consequently if (22) holds 
we have 

$$|F_1 - F| < E^{(3)} |u|^{\mu+1}.$$

Thus, under the above hypotheses, we have 

$$|F_l - F_{l-1}| < E^{(3)} |z_{l-1}|^{-\mu-1}, \qquad (l = 1, 2, \ldots, n).$$

Moreover $F_0 = 0$ since F is divisible by v. By combination we therefore obtain 

!/■,!< 

i-0 


Using the preceding inequality for $|z_j|$ we find 


1F, 1 <£«■ 2 1 I - ' -1 

i-0 

« 

But this sum is less than 



dn 1 T dt 


where we have written n — 12 / 1 <. The final integral which appears has evidently 
as greatest value 

f __ *_ 

.1 W — , + ta, x !"♦' 

0 

inasmuch as $z_l$ has a positive real part. Thus finally we obtain 

(24) $|F_l| < E^{(5)} |z_l|^{-\mu}$, $(l = 1, 2, \ldots, n)$, 

where $E^{(5)}$ is a definite positive constant which does not increase as R increases. 
Furthermore, from the explicit form of F we have 

$$|F|\, |z|^2 > E^{(6)} |v|$$




so long as (22) holds for (z, v). Thus we obtain 

$$|F_l|\, |z_l|^2 > E^{(6)} |v_l|, \qquad (l = 1, 2, \ldots, n).$$

Combining this inequality with (24) there results 

$$|v_l| < E^{(7)} |z_l|^{-\mu+2}, \qquad (l = 1, 2, \ldots, n).$$

Since $\mu - 2 > 3$ the second inequality (22) continues to hold until $\Re(z_l) \le R$. 
Our main result may be formulated as follows: 

If $\mu > 5$, $a_{21} > 0$, $\Re(1/u) > R > 0$, and if v = 0, then we have 

(25) $|v_l| < E^{(7)} |u_l|^{\mu-2}$ 

for l = 1, 2, ... until $\Re(1/u_l) \le R$. 

It is evident that $\Re(1/u_l)$ ultimately becomes less than R. 


§ 34. Further inequalities in the hyperbolic case II". 

The inequalities of § 33 are not sufficient for our purposes. It is necessary 
to evaluate $u_l$ more precisely than we have done. 

To this end we write 

$$w = \frac{1}{u} + \alpha \log u + \beta u + \cdots + \kappa u^k,$$

where k is arbitrarily large. Also let $\bar u_1$ stand for the series formed by the terms 
in $u_1$ which involve u only. We propose to determine real quantities $\alpha$, $\beta$, ..., $\kappa$ 
so that 

$$|w(\bar u_1) - w(u) + a_{21}| < E\, |u|^{k+1}.$$

This condition will be met if 

$$\left[\frac{1}{\bar u_1} - \frac{1}{u} + a_{21}\right] + \alpha \log\frac{\bar u_1}{u} + \beta(\bar u_1 - u) + \cdots + \kappa(\bar u_1^k - u^k)$$




is of the (k + 1)th order in u. The term in brackets is a convergent power 
series in u without constant term. The following term is a similar series beginning 
with a linear term $\alpha a_{21} u$; hence $\alpha$ can be so chosen that the first two 
terms form a power series without constant or first degree term. The third 
term is a similar series with leading term $\beta a_{21} u^2$; hence $\beta$ can be so chosen 
that the first three terms form a power series beginning with terms of the 
third degree or higher. Continuing in this way we arrive at a determination of 
$\alpha$, $\beta$, ..., $\kappa$ which yields an expression w with the desired property. 
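
The first step of this determination can be written out explicitly. The following is an illustrative calculation, not taken from the source: it assumes that the series of terms of $u_1$ involving u alone (written $\bar u_1$ here, a notation chosen for this sketch) begins $\bar u_1 = u + a_{21}u^2 + b\,u^3 + \cdots$, where b is a hypothetical name for the next coefficient.

```latex
\begin{aligned}
\frac{1}{\bar u_1} - \frac{1}{u} + a_{21}
  &= \frac{1}{u}\Bigl[\bigl(1 + a_{21}u + b\,u^{2} + \cdots\bigr)^{-1} - 1\Bigr] + a_{21}
   = \bigl(a_{21}^{2} - b\bigr)\,u + \cdots, \\
\alpha \log\frac{\bar u_1}{u}
  &= \alpha \log\bigl(1 + a_{21}u + \cdots\bigr)
   = \alpha\, a_{21}\, u + \cdots, \\
\text{so the first degree terms cancel when}\quad
\alpha &= \frac{b - a_{21}^{2}}{a_{21}} .
\end{aligned}
```

The same pattern repeats at each later step: the new unknown enters multiplied by $a_{21}$, which is not 0, so it can always be solved for.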

Suppose now that we introduce the variable w instead of the similar variable 
$z = 1/u$ (§ 33), taking $\Re(1/u) > R$ and choosing the principal value of log u 
in the expression w. It is clear that $\Re(w)$ is large when $\Re(z)$ is large and that 
the region $\Re(z) > R$ corresponds to a region of similar character in the w-plane, 
with nearly vertical tangent throughout and crossing the real axis far to the right 
of the origin in the w-plane. Hence, by Darboux's well-known theorem, the 
correspondence between these regions in the w-plane and z-plane is one-to-one 
and conformal. Moreover, in this part of the w-plane $w/z$ is nearly 1. 

From the definition of ti it appears that 

and thence from the explicit expression for F , 

|ti, — |tz|“* 

when (22) holds. Further, we have 

I*!-*, 

when (25) holds, from which 

| «*(*,)-!*(«) | <& W I UK 

If we recall the defining property of w(u) and take $k < \mu - 5$, we get 
finally 

|«o(v 1 )-to(u) + a, 1 |<^'>|«r*. 

Applying this inequality successively for the sequence of values (v, 0) of 
§ 34 (when (22), (25) hold), we obtain 



$$|w_1 - w + a_{21}| < E^{(11)} |w|^{-k-1},$$

$$|w_2 - w_1 + a_{21}| < E^{(11)} |w_1|^{-k-1},$$

$$\cdots$$

$$|w_n - w_{n-1} + a_{21}| < E^{(11)} |w_{n-1}|^{-k-1}.$$

Thus in the complex w-plane each point w, $w_1$, ..., $w_n$ in the region 
$\Re(w) > R'$ falls approximately at a distance $a_{21}$ to the left of its predecessor. 
By the method used in § 33 it is apparent that the sum 

$$\sum_{j=0}^{l-1} |w_j|^{-k-1}$$

and $|w_l - w + l a_{21}|$ are of the order $|w - l a_{21}|^{-k}$. Hence we have: 

Assume $\mu > 5$, $k < \mu - 5$, $a_{21} > 0$, $\Re(1/u) > R$, and v = 0. Write 

$$w = \frac{1}{u} + \alpha \log u + \beta u + \cdots + \kappa u^k,$$

where $\alpha$, ..., $\kappa$ are suitably determined constants. Then we have 

(26) $|w_l - w + l a_{21}| < E^{(12)} |w - l a_{21}|^{-k}$ 

for l = 1, 2, ... until $\Re(1/u_l) \le R$. 

§ 35. The invariant curves in the hyperbolic case II". 

With the facts deduced in §§ 33, 34 in mind, we can readily prove the 
existence of an invariant curve. 

Let us take w, v as our variables where w is restricted to the region of 
the w-plane which corresponds to $\Re(1/u) > R$. 

Consider the two sequences of functions of w: 

$$w, \quad w_1(w + a_{21}, 0), \quad w_2(w + 2a_{21}, 0), \quad \ldots$$
$$0, \quad v_1(w + a_{21}, 0), \quad v_2(w + 2a_{21}, 0), \quad \ldots.$$

According to the inequalities (26), (25), we have 



$$|w_l(w + l a_{21}, 0) - w| < E^{(13)} |w|^{-k},$$
$$|v_l(w + l a_{21}, 0)| < E^{(14)} |w|^{-\mu+2},$$

inasmuch as $w/z$ approaches 1. 

Thus, for $\Re(w)$ sufficiently large and positive, the sequences $w_l$, $v_l$ remain 
bounded and define a closed set $\Sigma$ of limiting functions $w^*(w)$, $v^*(w)$ analytic 
within the same w domain,¹ and restricted by the inequalities 

$$|w^*(w) - w| \le E^{(13)} |w|^{-k}, \qquad |v^*(w)| \le E^{(14)} |w|^{-\mu+2}.$$

Now we have 

$$T\bigl(w_l(w + l a_{21}, 0),\; v_l(w + l a_{21}, 0)\bigr) = \bigl(w_{l+1}(w + l a_{21}, 0),\; v_{l+1}(w + l a_{21}, 0)\bigr).$$

Thus the transformed sequences of functions have as limiting functions 

$$w^*(w - a_{21}), \qquad v^*(w - a_{21}).$$

In other words the totality $\Sigma$ is carried over itself by T, the change of parameter 
being $w_1 = w - a_{21}$. 

Now let us suppose w and v to be real, with w sufficiently large and 
positive. The transformation from (w, v) to $(w_1, v_1)$ is then a real analytic 
transformation, and the totality of curves specified above are analytic with the 
positive w-axis as asymptotes. In fact these curves $\Sigma$ have contact of order $\mu - 2$ 
at least with the w-axis at ∞. 

Let us return now to the prepared real uv-plane of § 32. The relation 
between w and u shows that in the real uv-plane the curves $\Sigma$ are defined for 
u sufficiently small and positive, and are analytic curves with contact of order 
$\mu - 2$ at least with the u-axis at (0, 0). On account of the mode of definition 
of the curves in the complex domain the inequality $|v| < E^{(15)} |u|^{\mu-2}$ holds 
uniformly for all of these curves. 

It follows that the totality $\Sigma$ consists of only one curve. 

In fact, consider the region u > 0 bounded by these curves and the line 
u = d. If there were more than a single such curve, such a region would 
necessarily arise and lie within the region 

0 < u < d, 

1 Cf. W. F. Osgood, On the uniformisation of algebraic functions, Annals of Mathematics, vol. 
14, 1912-1913, pp. 152-154. 




By the transformation T we have 

$$u_1 = u + a_{21}u^2 + \cdots.$$

We see at once that this region goes into another which includes it. For, the 
upper and lower boundaries of the region bounded by the $\Sigma$ curves are carried 
into themselves (the totality $\Sigma$ being invariant), while the line u = d is moved 
to the right. This is impossible since $\iint Q\, du\, dv$ is invariant under T. 


Passing back to the original uv-plane, we infer the existence of an analytic 
invariant curve ending at the invariant point and having contact of order 
$\mu - 2$ with the corresponding formally invariant curve. When the curve is 
represented in the form $v = \varphi(u)$ say, $\varphi$ is continuous together with its derivatives 
of the first $\mu - 2$ orders for u = 0. 

Now $\mu$ is an arbitrarily large integer. By increasing $\mu$ we cannot obtain 
further invariant curves, as is seen at once by a repetition of the above argument 
as applied to the region between such curves. Therefore, the invariant 
curve when represented in the form $v = \varphi(u)$ say yields a function $\varphi$ analytic 
for $u \neq 0$, continuous together with its derivatives of all orders for u = 0, and 
formally coinciding with the formally invariant curve. 

All of the above only applies if $a_{21} > 0$. But if $a_{21} < 0$ then the analogous 
quantity for $T_{-1}$ is $-a_{21}$. Hence we can arrive at the same conclusion if $a_{21} < 0$ 
by considering $T_{-1}$ instead of T. 

Clearly we can deal with the case u < 0 by merely rotating the axes in the 
prepared uv-plane through the angle $\pi$. 

Let us call a real function f(t) of a real variable t hypercontinuous for $t = t_0$ 
if f(t) is analytic for $t \neq t_0$, $|t - t_0| < d$ (d > 0), and continuous together with all of 
its derivatives for $t = t_0$. Similarly a curve is hypercontinuous at a point if its 
coordinates can be expressed as hypercontinuous functions of a parameter t. 
With these definitions we can summarize our results as follows: 

In the case II" when there are three formally invariant curves with ordinary 
points and distinct tangents, one or all three of these will be real. To each such 
real formal curve corresponds a unique hypercontinuous curve through the invariant 
point which is invariant under T and has the corresponding asymptotic representation 
at the invariant point. 
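
A standard example, added here for orientation and not taken from the source, shows that a function can be hypercontinuous for t = 0 in the sense just defined without being analytic there.

```latex
f(t) =
\begin{cases}
  e^{-1/t^{2}}, & t \neq 0, \\
  0,            & t = 0,
\end{cases}
\qquad
f^{(n)}(0) = 0 \quad (n = 0, 1, 2, \ldots).
```

Its Taylor series at t = 0 vanishes identically although f does not, which is the same phenomenon as an invariant curve that agrees formally with a divergent or non-representative series at the invariant point.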

It is clear that the method above is not essentially limited to the discus¬ 
sion of real invariant curves but these are all we need to consider. 




§ 36. Extension to the general hyperbolic case II", II'''. 

It is easy to see that the above work admits of an extension to the most 
general case II". 

Suppose first we fix attention on any real formally invariant curve C in 
the case II" which has an ordinary point at (0, 0). 

We can begin as before (§ 32) by taking a prepared uv-plane in which this 
curve osculates the u-axis to order $\mu$. 

The series for $u_1$ can be taken to contain a term $cu^p$ of least degree p > 1, 
where p does not increase indefinitely with $\mu$. Otherwise, when the u-axis is 
made the invariant curve by a formal change of variables, it will be an invariant 
point curve, and such a curve has previously been observed to be analytic. 

The series in $v_1$ is divisible by v out to terms of degree $\mu + 1$ as before. 
The formal series F* consists of a polynomial of degree at most $\mu$ divisible 
by v with a leading term $c\,v\,u^p$ and a formal series with initial terms of degree 
at least $\mu + 1$. 

Let us assume c > 0 and take 

W(u) > o. |c| <|u|'. 

Further let us introduce the variable $z = 1/u^{p-1}$. We find easily that, for $\Re(z) > R$, 

$$|z_1 - z + (p - 1)c| < E\, |z|^{-\frac{1}{p-1}},$$

where E is a suitable positive constant, and we can carry through a discussion 
analogous to that contained in §§ 33, 34. 
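
The order of the remainder in this inequality can be checked formally. The following is a sketch under an assumption, not a reproduction of the source: along v = 0 the series for $u_1$ is taken to begin $u_1 = u + c\,u^p + \cdots$, the omitted terms being of degree at least p + 1.

```latex
\begin{aligned}
z_1 = u_1^{-(p-1)}
  &= u^{-(p-1)} \bigl(1 + c\,u^{p-1} + O(u^{p})\bigr)^{-(p-1)} \\
  &= z - (p-1)\,c + O(|u|),
  \qquad |u| = |z|^{-\frac{1}{p-1}} .
\end{aligned}
```

For p = 2 this reduces to the form (23') of § 33, with $c = a_{21}$ and a remainder of order $|z|^{-1}$.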

Introducing next a variable w, 

u>- -i- + + •••+* log it + ••• + ou 4+ '-', 

U*~ l U *- 1 

we can determine a.fi, ...,a so that 

| u>(u) — ic(it) + (p-OcIcS'M**' 

as in § 34, and can generalize the results there obtained. 

The existence of a unique invariant curve can then be proved as in § 35. 
When the real formally invariant curve has a 'cusp' at (0, 0), this can be 
reduced to an 'ordinary point' by a succession of changes of variables of the 




type u = uv, v = v, and then an argument may be made like that carried through 
in §§ 33-35. 

We will not stop to enter into details, but merely state the conclusion: 

In any case II" to every real formally invariant curve corresponds a unique 
hypercontinuous invariant curve with the corresponding asymptotic representation. 

In the hyperbolic case II''', $T_q$ is of type II". Hence we infer: 

In the hyperbolic case II''', $\theta = 2\pi p/q$, the invariant curves under $T_q$ are 
of type II" and their images are invariant as a set under T. 


§ 37. A general property in case II". 

In case II" under the restrictions of § 32 every point of the region $u^2 + v^2 < \delta^2$ 
not on one of the real invariant curves is carried out of the region by iteration of T 
or $T_{-1}$, while every point on one of these curves approaches the invariant point 
(0, 0) by iteration of T and is carried out of the region by iteration of $T_{-1}$, or 
vice versa.¹ 

There may be either three real invariant curves, or a single such curve. 

Let us consider the first of these subcases. Here the neighborhood of (0, 0) 
in the plane is divided into six parts, bounded by arcs of the invariant analytic 
curves. These six regions evidently go over into themselves under T or $T_{-1}$. 
Let us consider a particular one of these regions, and let us first take tangents 
to the corresponding arcs of the two invariant curves at (0, 0) as axes. The 
hypercontinuous invariant curves have equations $v = \varphi(u)$, $u = \psi(v)$ referred to 
these axes. 

Make the further change of variables 

$$U = u - \psi(v), \qquad V = v - \varphi(u).$$

The right-hand members of these equations are continuous together with their 
partial derivatives of all orders in u, v, analytic except for u = 0 or v = 0. In 
the new variables the invariant curves appear as the U- and V-axis, while the 
region under consideration becomes the first quadrant in the UV-plane. 

¹ It is apparent from this result that no other invariant curves through the invariant
point can exist.

Levi-Civita (loc. cit.) proved that certain nearby points are carried away from the invariant
point in this and other hyperbolic cases, showing that the point is unstable.

See also A. R. Cigala, Sopra un criterio di instabilità, Annali di Matematica, ser. 3, vol. 11, 1905.



This further change of variables is formally of the type (15) so that we
have (see § 32)

$$U_1 = U[1 + a_{21} U + 2a_{12} V + \cdots], \qquad V_1 = V[1 - 2a_{21} U - a_{12} V + \cdots].$$

The factors in brackets are analytic for $U > 0$, $V > 0$ of course. We can readily
show that these factors are continuous together with all of their partial derivatives
for $U \ge 0$, $V \ge 0$.

In fact consider $U_1/U$
as a point $(u, v)$ approaches the invariant curve $U = 0$. By the ordinary rule
for the evaluation of an indeterminate form the limit will be given by 


lim 


•]u> _ d 

Uu dv t 0til 


at the point in question. Hence the first bracket, and likewise the second bracket,
are continuous functions for $U \ge 0$, $V \ge 0$. By successive steps of like nature
all of the partial derivatives of the brackets may be shown continuous.

In the subcase under consideration there is a third real invariant curve in
the second and fourth quadrants obtained by factoring formally

$$F^* = UV[a_{21} U + a_{12} V + \cdots].$$

We see that $a_{21}$ and $a_{12}$ are of the same sign (say positive), for this third
invariant curve is given by

$$a_{21} U + a_{12} V + \cdots = 0.$$

Returning to the explicit form of $U_1$, $V_1$ above, we infer that $U$ ($V$) increases
and $V$ ($U$) diminishes under iteration of $T$ ($T_{-1}$) for any point in the first quadrant.
If $(U, V)$ approaches a definite point $(\bar{U}, \bar{V})$ with $\bar{U} > 0$ ($\bar{V} > 0$) within the region
$U^2 + V^2 < d^2$, this point is necessarily invariant under $T$. But, inasmuch as there
are no multiple factors of $F^*$ and thus no invariant point curves in the case at
hand, there will be no invariant point in this region except $(0, 0)$. Thus the
first part of the italicized statement holds in this case.

The part of the statement which deals with the behavior of points on the
invariant curves is obviously true in all cases. If it were not we should have
isolated invariant points on these invariant curves lying arbitrarily near to $(0, 0)$,
and this is impossible.

We have next to discuss the subcase where there is a single real formally
invariant curve. Let us take this curve into the $U$-axis by a transformation like
that made above. We have then

$$U_1 = U + a_{21} U^2 + 2a_{12} UV + 3a_{03} V^2 + \cdots,$$

$$V_1 = V[1 - 2a_{21} U - a_{12} V + \cdots],$$

where the brackets stand for a type of functions similar to those in brackets
above.

Here one has

$$F^* = V[a_{21} U^2 + a_{12} UV + a_{03} V^2 + \cdots].$$

The quadratic form in brackets is definite since there are a pair of conjugate
formally invariant curves with distinct conjugate directions.

We find

$$U_1 V - V_1 U = V[3(a_{21} U^2 + a_{12} UV + a_{03} V^2) + \cdots].$$

This equation renders it apparent that $\tau = \tan^{-1}(V/U)$ varies continually in one
sense under indefinite iteration of $T$ or of $T_{-1}$ as long as a point and its iterates
remain near $(0, 0)$. If $\lim \tau = 0$ or $\pi$, $|V/U|$ approaches $0$, and the formulas for
$U_1$, $V_1$ show that $|U|$ and $|V|$ vary in opposite senses. In this case $|U|$ increases
and the point cannot remain near $(0, 0)$.


Moreover the point cannot remain near $(0, 0)$ in the contrary case. If
$\lim \tau = \bar{\tau} \neq 0, \pi$ the above equation shows that $V$ approaches $0$; in fact the
variation in $\tau$ is of the first order in $V$. The variable $U$ must likewise tend to $0$.
But the geometry of the figure in the plane makes it clear that $\dfrac{V_1 - V}{U_1 - U}$ must
approach $\tan \bar{\tau}$ indefinitely often, at the same time. If we recall that $V/U$
approaches this value also and employ the formulas for $U_1$, $V_1$, we find readily

$$\frac{-2a_{21} \tan \bar{\tau} - a_{12} \tan^2 \bar{\tau}}{a_{21} + 2a_{12} \tan \bar{\tau} + 3a_{03} \tan^2 \bar{\tau}} = \tan \bar{\tau},$$

whence

$$3 \tan \bar{\tau}\,(a_{21} + a_{12} \tan \bar{\tau} + a_{03} \tan^2 \bar{\tau}) = 0,$$

which is impossible.




§ 38. Extension to a more general case II". 

The same property holds in the most general hyperbolic case. 

The kernel of the method of proof employed in § 37 depends on the use
of a function which increases or decreases upon iteration of $T$. This method
can be applied to a somewhat more general case than has been treated above,
namely that in which all the real directions of formally invariant curves at $(0, 0)$
are distinct. It is this case which we treat first. In dealing with the most
general case (§ 39), however, we are obliged to employ less direct means.

Suppose that the property fails to hold, so that there are real points not on
an invariant arc which remain in an arbitrarily small neighborhood of $(0, 0)$
under indefinite iteration of $T$ (or of $T_{-1}$, if not of $T$).

There will then exist such points in some one of the regions into which the
invariant arcs divide the vicinity of $(0, 0)$, and it is upon such a region that we
fix attention. For the present we assume there is more than a single real
invariant curve.

By a change of variables $U = u - \psi(v)$, $V = v - \varphi(u)$ (§ 37), the region
between the invariant boundary arcs may be taken into the first quadrant, in
such wise that the invariant arcs become the $U$- and $V$-axes. The variables $U$, $V$
are analytic in $u$, $v$, save for $u = 0$ or $v = 0$ when $U$, $V$ are continuous together
with all of their partial derivatives. Furthermore we have

$$U_1 = U\left[1 + \left(H + V\frac{\partial H}{\partial V}\right) + \cdots\right], \qquad V_1 = V\left[1 - \left(H + U\frac{\partial H}{\partial U}\right) + \cdots\right],$$

where $UVH$ is the homogeneous polynomial of lowest degree $m \ge 3$ in $F^*$, and
where the brackets stand for functions analytic for $U > 0$, $V > 0$, and continuous
together with all of their partial derivatives. The factors $U$, $V$ in $UVH$ correspond
to the invariant axes. The factors of $H$ are either real linear factors $\alpha U + \beta V$ ($\alpha\beta > 0$)
or complex linear factors, since there are no real invariant curves in the first
quadrant of the $UV$-plane. Hence $H$ is of one sign, say positive, near $(0, 0)$
and of the order $m - 2$ in $\sqrt{U^2 + V^2}$.

From the above equations and the facts stated we have

$$U_1 V - V_1 U = UV[mH + \cdots] > 0;$$

in consequence $\tau = \tan^{-1}(V/U)$ varies continually in one sense upon iteration of $T$
or of $T_{-1}$, and must approach a limit.




This limit must be $0$ or $\pi/2$. In fact since there are no invariant points
near $(0, 0)$ (invariant point curves correspond to multiple factors of $F^*$), the
corresponding point would necessarily approach $(0, 0)$ in the contrary case. If
$r > 0$ denotes the corresponding $\lim V/U$, the fraction

$$\frac{V_1 - V}{U_1 - U}$$

can be made nearly equal to $r$ with negative denominator and numerator; this
is easily seen geometrically. Hence we have (compare § 37)

$$\lim \frac{H + V\,\partial H/\partial V}{H + U\,\partial H/\partial U} = -1$$

along this direction, whence $H = 0$ for $V/U = r$. This direction will correspond to
a real formally invariant curve, which is absurd.

Also this limit is not $0$, for the formulas for $U_1$, $V_1$ show then that $|U|$,
$|V|$ will vary in opposite senses and the point will recede from $(0, 0)$ along
the $U$-axis. Similarly the limit is not $\pi/2$.
This completes the discussion when there is more than one real invariant 
curve. The argument is easily modified to meet the case of a single such curve 
(compare § 37). 

The property of § 37 holds therefore if the real tangent directions of the formally
invariant curves are all distinct.


§ 39. Extension to the general case II", II"'. 

We propose to deal in outline with the general case II". As before, we 
assume the property not to hold, and show that a contradiction results. 

The region under consideration is bounded by two invariant arcs which may
or may not have the same tangent direction. An argument like that used above
may be partially applied. If we form the difference $u_1 v - v_1 u$, it is given by a
series beginning with a constant multiple of the homogeneous polynomial of

Acta mathtmatiea. 43. Imprtme !e 23 man inn. 10 



lowest degree in $F^*$. But for directions within the region making this polynomial
vanish there must be an even number of equal factors $\alpha u + \beta v$, since if there
were an odd number there would be at least one corresponding real formally
invariant curve and thus an invariant curve within the region, contrary to
hypothesis. Hence $u_1 v - v_1 u$ preserves a constant sign save near these critical
directions.

Moreover, if, under iteration of $T$, a point moves away from the vicinity
of such a critical direction, it rotates in a constant sense about $(0, 0)$ to the
vicinity of the next following critical direction (compare § 38).

Since there are only a finite number of such critical directions, there will
then be points remaining in the indefinitely small vicinity of one such direction
under indefinite iteration. It is upon such a critical direction and its neighborhood
within the region under consideration that we now fix attention.

Let us take this critical direction along the positive $u$-axis, and make the
change of variables

$$u = \bar{u}, \qquad v = \bar{u}\bar{v};$$

in the new variables the transformation $T$ then is readily found to have the
form II''

$$\bar{u}_1 = \bar{u} + \cdots, \qquad \bar{v}_1 = \bar{v} + \cdots.$$


The series $F^*(\bar{u}, \bar{u}\bar{v})$ is of course formally invariant, and in general the same
methods of formal reckoning apply as earlier.

The first distinction to be noted is that the invariant integral $\iint Q\,du\,dv$
becomes $\iint \bar{u} Q\,d\bar{u}\,d\bar{v}$; the new quasi-invariant function $\bar{u}Q$ is analytic but
vanishes at $(0, 0)$. The second distinction is that the line $\bar{u} = 0$ in the $\bar{u}\bar{v}$-plane
is evidently an invariant point curve corresponding to the invariant point $(0, 0)$
in the $uv$-plane.

Also there are infinitely many points $\bar{u} > 0$ in the $\bar{u}\bar{v}$-plane which remain
near $(0, 0)$ under indefinite iteration of $T$, and yet do not lie on an image of an
invariant curve in the $uv$-plane.

The formal differential equations (9) in $\bar{u}_k$, $\bar{v}_k$ are clearly

$$\bar{u}_k Q_k \frac{d\bar{u}_k}{dk} = \frac{\partial F^*}{\partial \bar{v}_k}, \qquad \bar{u}_k Q_k \frac{d\bar{v}_k}{dk} = -\frac{\partial F^*}{\partial \bar{u}_k},$$

where by $F^*$ is meant the series $F^*(\bar{u}, \bar{u}\bar{v})$.


In the $\bar{u}\bar{v}$-plane there are also certain critical directions, finite in number,
along which the points above referred to cluster. In fact $\bar{u}_1 \bar{v} - \bar{v}_1 \bar{u}$ has the
same initial terms as

&r-ai +v -»ij- 

As before, the lowest terms here form a homogeneous polynomial in $\bar{u}$, $\bar{v}$ of one
sign or zero for $\bar{u} > 0$.

Repetition of the reasoning and further like changes of variables can now
be made. It may be observed that the invariant point curve $\bar{u} = 0$ introduced
at any stage is either eliminated by a further change of variable, or corresponds
to the new $u$-axis. Consequently the extraneous invariant point curves are either
$u = 0$ or $u = 0$ and $v = 0$.

Since there are only a finite number of formally invariant curves and the
changes of variables used lower their order of contact, a stage must finally be
reached at which either (1) there is no formally invariant curve not of extraneous
type or (2) there is only one such curve. In case (2) it is clear that we may
assume this curve to have an 'ordinary point' at $(0, 0)$ with tangent direction
distinct from that of an extraneous invariant curve; the changes of variables
employed separate formally distinct curves and eliminate a 'cusp'. One further
change of the same type will then make $u = 0$ the only extraneous invariant
curve.

Let us begin with case (1) when $u = 0$ and $v = 0$ are extraneous.

Since $\iint Q(u, v)\,du\,dv$ is an invariant integral it is clear that points
$Q(u, v) = 0$ go into points $Q = 0$. Thus $Q = 0$ gives a set of real analytic
curves invariant under $T$. Such curves are necessarily individually invariant
inasmuch as $u = 0$ is invariant. But there are no such curves save $u = 0$ and
$v = 0$. Hence we have

$$Q(u, v) = u^l v^m R(u, v), \qquad (l > 0,\ m > 0),$$

where $R(0, 0) \neq 0$.

Also $F^* = 0$ yields formally invariant curves so that

$$F^* = u^p v^q G(u, v), \qquad (p > l + 1,\ q > m + 1),$$

where $G$ is a formal power series with constant term.

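Now by the formal differential equations (9) for this case we have

$$u^l v^m R\,\frac{du}{dk} = \frac{\partial}{\partial v}\left[u^p v^q G\right], \qquad u^l v^m R\,\frac{dv}{dk} = -\frac{\partial}{\partial u}\left[u^p v^q G\right].$$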




Thus the series for $u_1$, $v_1$ have the form

$$u + u^{p-l} v^{q-m-1}[qc + A], \qquad v + u^{p-l-1} v^{q-m}[-pc + B]$$

respectively, where $A$ and $B$ are power series without constant terms. Under
iteration of $T$ or $T_{-1}$ either $|u|$ increases and $|v|$ decreases, or vice versa.
Consequently a point which remains in the vicinity of $(0, 0)$ will approach a limiting
point on the $u$- or $v$-axis, distinct from $(0, 0)$.

For definiteness suppose the point to lie in the first quadrant with $c > 0$.
Such a point will then approach a point of the positive $u$-axis near $(0, 0)$ under
iteration of $T$. But the series above show that for such a point

$$\frac{\Delta v}{\Delta u} > -Kv, \qquad (K > 0).$$

Thus $v$ decreases less rapidly than if

$$\frac{dv}{du} = -Kv,$$

when by integration we find $v = c\,e^{-Ku}$. Hence $u$ cannot approach a limit as $v$
approaches $0$ but must increase indefinitely.
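
The integration step in this comparison can be written out as a short worked equation. This is only a sketch: it assumes the comparison equation is $dv/du = -Kv$ with $K > 0$, and $C$ is a name introduced here for the positive constant of integration (written $c$ in the text).

```latex
\frac{dv}{du} = -Kv
\;\Longrightarrow\;
\frac{dv}{v} = -K\,du
\;\Longrightarrow\;
\log v = -Ku + \text{const.}
\;\Longrightarrow\;
v = C e^{-Ku}, \qquad C > 0 .
```

Since $C e^{-Ku}$ stays bounded away from $0$ as long as $u$ is bounded, a quantity that decreases no faster than this can tend to $0$ only if $u$ grows without bound.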

The case when only $u = 0$ is extraneous admits of similar discussion.

Case (1) is now disposed of. Let us consider case (2).

By a formal change of variables of the type employed in § 37 we may take
$u = 0$ as the invariant point curve and $v = 0$ as the other invariant curve.
Formally then we are essentially in case (1), above disposed of. Indeed if the
invariant curve is analytic no modification is required.

If the hypercontinuous invariant curve is not analytic there can be no
corresponding factor of $Q$, i. e. after the new change of variables we have

$$Q(u, v) = u^l R(u, v), \qquad (l > 0),$$

where $R(0, 0) \neq 0$.

Moreover, after this change of variables, $v$ only occurs once as a factor of
$F^*$. For a multiple factor gives an analytic invariant point curve (§ 20). Thus
we have

$$F^* = u^p v\,G(u, v), \qquad (p > l + 1),$$

where $G(0, 0) \neq 0$.




Consequently we have here

$$u_1 = u[1 + c\,u^{p-l-1} + \cdots], \qquad v_1 = v[1 - cp\,u^{p-l-1} + \cdots],$$

where $c \neq 0$. The brackets stand for functions continuous together with all of
their partial derivatives (see § 37), and with the asymptotic representation
indicated at $(0, 0)$.

But the points remaining near $(0, 0)$ under indefinite iteration of $T$ or $T_{-1}$ lie
approximately in the direction of the $u$-axis; otherwise, before the above
non-analytic change of variables was made, we might have removed the invariant
curve by another change of variables, and thus have arrived at case (1).

As a result $|u|$ increases and $|v|$ decreases. The above formulas demonstrate
this fact. This possibility is excluded since there are no invariant points
near $(0, 0)$ not on $u = 0$.

Thus case (2) is also disposed of. 

Since in case II''', $T^q$ is of type II'' we may state:

The property of § 37 holds in the most general hyperbolic cases II'', II'''.


§ 40. The hyperbolic case III', III”. 

The non-specialized case III' is of hyperbolic type as appears from an
inspection of (13). If we assume that the coefficient of r* in F * is not zero, 
we obtain a real formally invariant curve with cusp at $(0, 0)$.

Now by the change of variables (see § 6)

$$u = \bar{u}\bar{v}, \qquad v = \bar{v},$$

$T$ takes the form II''. By the use of the methods of §§ 32-39 we can infer:

In the hyperbolic case III' to each real formally invariant curve corresponds a
unique hypercontinuous curve which is invariant under $T$ and has the corresponding
asymptotic representation at the invariant point.

In the hyperbolic case III'' the invariant curves under $T^2$ are of type III' and
their images are invariant as a set under $T$.

The property of § 37 holds in the hyperbolic case III. 


§ 41. Invariant curves and the hyperbolic case. 

We aim finally to show that a certain kind of converse to the above can 
be found: 


If $T$ is a conservative transformation I', II', II'', III' for which $(0, 0)$ is an
invariant point, and if there exists an invariant continuous arc ending at $(0, 0)$ for
which $\tan^{-1}(v/u)$ remains finite, then the invariant point is hyperbolic and the invariant
arc is an arc of a hypercontinuous invariant curve obtained above.

If the invariant point can be proved hyperbolic the remainder of the statement
can be demonstrated at once. In fact all points not on one of these hypercontinuous
arcs leave a definite vicinity of $(0, 0)$ under both $T$ and $T_{-1}$, according
to the general property developed above. But the invariant arc is
carried into part of itself either by $T$ or $T_{-1}$. Therefore it must consist of
points on one of the hypercontinuous arcs.

Let us take first the general case when $T$ is of type I' or II' at the invariant
point $(0, 0)$ and let us suppose if possible that $T$ is elliptic at that point.

Let $\alpha$, $\beta$ be the upper and lower limits of $\tan^{-1}(v/u)$ along the curve. These
are invariant under $T$ of course. Hence the lines through $(0, 0)$ in these directions
are carried into curves tangent to these respective lines at $(0, 0)$. Thus
we have the phenomenon of invariant directions, which is absurd in case II'.

Hence $T$ is of type I', and $(0, 0)$ is a hyperbolic point.

If $T$ is of type II'' every direction through the invariant point is invariant.
It is necessary here to have recourse to a more elaborate argument to show that
$T$ is hyperbolic.

For definiteness we assume that $T$ carries the invariant arc into part of
itself. Define $\alpha$ and $\beta$ as above. If $\alpha > \beta$ we can find a line $v = cu$ which
intersects the invariant arc infinitely often near $(0, 0)$. But the image of this
line lies on one side or the other of the line $v = cu$, at least near $(0, 0)$, since
$T$ is analytic. Thus it is apparent that the total area between the line and invariant
arc on the same side of the line is carried over into part of itself by $T$,
which is absurd. Hence there is only a single limiting direction, i. e. $\alpha = \beta$,
and the invariant arc does not meet the corresponding line $v = cu$ near the
invariant point.

This direction corresponds to the real tangent direction of a formally invariant
curve. Indeed the arguments employed in § 38 show that points not
approximately in such a real invariant direction from the invariant point are
rotated into such a direction under iteration of $T$, provided that the point remains
near $(0, 0)$ as is the case for a point of the invariant arc.

This formally invariant curve which has a real tangent direction will correspond
to a real formally invariant curve in general so that we have the hyperbolic
case II''.




There remains the possibility, however, that we have an even number of
formally invariant curves with real tangent directions but not with all coefficients
real. Here further consideration is required.

Take the straight line from $(0, 0)$ in the limiting tangent direction as the
$u$-axis and write

$$u = \bar{u}, \qquad v = \bar{u}\bar{v}$$

as in § 39. The $\bar{v}$-axis in the $\bar{u}\bar{v}$-plane is a line of invariant points under $T$,
and the invariant arc approaches $(0, 0)$ in this new plane. But this invariant
arc does not cross the line of invariant points of course.

Repeating the argument above we infer that this arc approaches $(0, 0)$ in
a definite limiting direction in the $\bar{u}\bar{v}$-plane. But it was established in § 39 that
such a limiting direction can only be along a real tangent direction to a formally
invariant curve. Hence again we argue that the invariant arc has the
direction tangent to $\bar{u} = 0$ or to a formally invariant curve, when another change
of variables as above is in order.

At each stage these changes of variables diminish the number of real coefficients
in the series for the formally invariant curve, until at last the first
coefficient is not real and there is no invariant direction. This is impossible
by our argument for case (1), § 39.

Similarly the case III' is disposed of.


Chapter III. Elliptic invariant points. Stable case. 


§ 42. Existence of closed invariant curves in the stable case.


In the integrable elliptic case there is a family of closed analytic curves
$F^* = \text{const.}$ about the invariant point, each invariant under $T$ but not of the
type above considered since these curves do not pass through the invariant
point. Such an invariant point is stable of course.

A somewhat analogous property can be established in the non-integrable
stable case. Let us understand by a closed curve the boundary of a simply connected
open continuum in the finite plane, while regarding that plane as completed
by the adjunction of a 'point at infinity'.

In the stable case there exist an infinite number of invariant closed curves surrounding
the invariant point and lying within any prescribed neighborhood of it.¹


¹ Compare the method of proof with a proof given by H. Poincaré, Les méthodes nouvelles
de la mécanique céleste, vol. 3, Paris, 1899, pp. 149-151.




Choose any arbitrarily small neighborhood of the invariant point. It is
then possible to find a second neighborhood $r \le \delta < d$ such that any point of
this latter neighborhood remains within the first under indefinite iteration of $T$.
This is the direct statement of the property of stability.

Now the open region $r < \delta$ and all of its images under $T$ include $(0, 0)$ as
an inner point and overlap. Let us speak of a point $P$ as occluded by this set
of regions if it is possible to draw a regular closed curve lying entirely within
the set and enclosing $P$ and $(0, 0)$. The set of occluded points $\Sigma$ is evidently
an open simply connected continuum containing all of the set of regions.

The image continuum $\Sigma_1$ is also made up of points of $\Sigma$; the curve enclosing
$P$ and $(0, 0)$ is carried into a curve enclosing $P_1$ and $(0, 0)$, lying within the set
of regions, and so $P_1$ is occluded, i. e. is a point of $\Sigma$.

Now $\Sigma$ cannot contain points not in $\Sigma_1$, since then $\iint Q\,du\,dv$ would be
larger over $\Sigma$ than over $\Sigma_1$. Hence $\Sigma_1$ coincides with $\Sigma$. The boundary of $\Sigma$
is therefore an invariant curve lying in the arbitrary neighborhood $\delta \le r \le d$ and
surrounding $(0, 0)$. Since $d$ is arbitrary there is clearly an infinitude of such
curves, invariant under both $T$ and $T_{-1}$.


Conversely, if there is an infinitude of such invariant curves about (o, o), that 
invariant point is clearly stable. 


§ 43. Some fundamental properties in the case II', $l = 1$.

The cases I', II', $l = 1$, may be regarded as constituting the non-specialized
case of an invariant point. In the second of these cases we have the first possibility
of stability. The discussion of this elliptic case II', $l = 1$ which we shall
make (both in the stable and unstable case) will be based on certain properties
established in the present paragraph.

Let us choose variables $u$, $v$ which osculate the normalizing variables $U$, $V$
of § 22, formula (19), to the order $\mu$ ($\mu > 2$). We will then have

$$\begin{aligned}
u_1 &= u\cos[\theta + c(u^2 + v^2)] - v\sin[\theta + c(u^2 + v^2)] + P(u, v),\\
v_1 &= u\sin[\theta + c(u^2 + v^2)] + v\cos[\theta + c(u^2 + v^2)] + Q(u, v),
\end{aligned} \tag{27}$$

in which $P$, $Q$ are given by convergent power series which begin with terms
of the $(\mu + 1)$th or higher degree.

We shall assume $c > 0$ for definiteness. It is clear that in the contrary case
$T_{-1}$ will be of this same form with $-c$ replacing $c$, so that our assumption is
no essential restriction.

The particularly simple integrable case $P = Q = 0$ affords a clear insight as
to the character of $T$. Circles with $(0, 0)$ as center are rotated into themselves
through an angle $\theta + cr^2$, increasing with or decreasing with the radial distance
$r$ according as $c > 0$ or $c < 0$.

This special case shows clearly the vortical nature of the transformation T 
in the neighborhood of the invariant point. 
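
The vortical behavior of the integrable case can be checked numerically. The sketch below assumes the map (27) with $P = Q = 0$; the function names `twist_map` and `rotation_angle` and the values of `theta` and `c` are illustrative choices made here, not taken from the text.

```python
import math

def twist_map(u, v, theta, c):
    """One step of the integrable map (27) with P = Q = 0."""
    angle = theta + c * (u * u + v * v)
    return (u * math.cos(angle) - v * math.sin(angle),
            u * math.sin(angle) + v * math.cos(angle))

def rotation_angle(u, v, theta, c):
    """Angle through which (u, v) is turned, reduced to [0, 2*pi)."""
    u1, v1 = twist_map(u, v, theta, c)
    return (math.atan2(v1, u1) - math.atan2(v, u)) % (2 * math.pi)

theta, c = 0.3, 1.0  # illustrative values (hypothetical), with c > 0
for r in (0.1, 0.2, 0.4):
    u1, v1 = twist_map(r, 0.0, theta, c)
    # The radius is unchanged; the turning angle is theta + c * r**2.
    print(r, math.hypot(u1, v1), rotation_angle(r, 0.0, theta, c))
```

Each circle about the origin is carried into itself, while the turning angle grows with the radius when $c > 0$, which is the twist the later sections rely on.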

It will be convenient for us to introduce polar coordinates $r$, $\varphi$. In these
variables the equations above take an equivalent form

$$r_1 = r + R(r, \varphi), \qquad \varphi_1 = \varphi + \theta + cr^2 + S(r, \varphi), \tag{28}$$

where

$$\begin{aligned}
R &= \left[r^2 + 2r\bigl(P\cos(\varphi + \theta + cr^2) + Q\sin(\varphi + \theta + cr^2)\bigr) + P^2 + Q^2\right]^{1/2} - r,\\
S &= \tan^{-1}\left[\frac{-P\sin(\varphi + \theta + cr^2) + Q\cos(\varphi + \theta + cr^2)}{r + P\cos(\varphi + \theta + cr^2) + Q\sin(\varphi + \theta + cr^2)}\right].
\end{aligned} \tag{29}$$

Since $P$, $Q$ are analytic power series in $r$ beginning with terms of degree $\mu + 1$
or higher, and with coefficients analytic in $\varphi$ with period $2\pi$, it is apparent
that $R$, $S$ are continuous functions of $r$, $\varphi$ for $r \ge 0$, expansible as power series
in $r$ with coefficients analytic in $\varphi$ of period $2\pi$. The first of these will begin
with terms of at least the $(\mu + 1)$th degree in $r$, while the second will begin
with terms of at least the $\mu$th degree.

These same considerations show that $R$, $S$ admit continuous partial derivatives
in $r$, $\varphi$ of all orders.

The coordinates $r$, $\varphi$ will be regarded constantly as rectangular coordinates
in an $r\varphi$-plane, so that the $\varphi$-axis $r = 0$ corresponds to the invariant
point $(0, 0)$ of the $uv$-plane. Two points for which the coordinates $r$ are
the same, but for which the coordinates $\varphi$ differ by a multiple of $2\pi$ are
called congruent. Two congruent points represent the same point of the $uv$-plane.

Suppose now that we have a point in the $r\varphi$-plane and a direction at the
point given by $\dfrac{d\varphi}{dr}$ which is the reciprocal of the slope. The corresponding
reciprocal slope at the transformed point under $T$ is then given by $\dfrac{d\varphi_1}{dr_1}$. This
quantity may be computed by means of (28) and has the value


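$$\frac{d\varphi_1}{dr_1} = \frac{\dfrac{d\varphi}{dr} + 2cr + \dfrac{dS}{dr}}{1 + \dfrac{dR}{dr}},$$

where the indicated differentiation is directional in character.
From this equation there results

$$\frac{d\varphi_1}{dr_1} - \frac{d\varphi}{dr} = 2cr + \frac{\dfrac{dS}{dr} - \left(2cr + \dfrac{d\varphi}{dr}\right)\dfrac{dR}{dr}}{1 + \dfrac{dR}{dr}}.$$

If we evaluate the directional derivatives on the right in terms of the partial
derivatives of $R$, $S$, we perceive at once that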


and thence 





as long as |~|say. 
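That is, we may write

$$\frac{d\varphi_1}{dr_1} - \frac{d\varphi}{dr} = 2cr + \chi\,r^{\mu - 1}\left(1 + \left(\frac{d\varphi}{dr}\right)^2\right), \tag{30}$$

where $|\chi| < K$ as long as $\dfrac{d\varphi}{dr}$ is restricted as stated.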


Let us term the small sheaf of slopes $\dfrac{d\varphi}{dr}$ at any point in the $r\varphi$-plane
for which


(31)




the barred angle. When (31) is not satisfied, the left-hand side of (30) is positive. 
Our conclusion may be formulated as follows: 




In the $r\varphi$-plane the transformation $T$ leaves $r$ unaltered to terms of order $\mu + 1$ in
$r$, increases $\varphi$ by $\theta + cr^2$ to terms of order $\mu$, and rotates any direction not in the
barred angle in a negative sense.

An entirely analogous discussion of $T_{-1}$ shows that, if $\varepsilon$ be taken small
enough in defining the barred angle, we have

In the $r\varphi$-plane $T_{-1}$ rotates any direction not in the barred angle in a positive
sense.

A quantitative discussion of the amount of rotation of directions can be
based on (30) and a similar equation for $\dfrac{d\varphi_{-1}}{dr_{-1}}$.

In particular we note that

Under iteration of $T$ ($T_{-1}$) any direction at a point is ultimately rotated into
or past a barred angle negatively (positively) if $r$ remains small.

If possible assume this statement not to be true.

In the first place we must have $\lim r = 0$. For in the contrary case there
is a rotation of definite negative amount occurring indefinitely often, and the
statement must hold.

If we let $\varphi' = \dfrac{d\varphi}{dr}$ the inequality (30) gives

$$\Delta\varphi' > 2ar, \qquad (a > 0),$$

as long as $r$ remains small and $\varphi'$ does not lie in the barred angle. At the same
time the formula (28) for $r_1$ shows that

$$|\Delta r| < b\,r^{\mu + 1}, \qquad (b > 0).$$

Consequently we have

$$\left|\frac{\Delta r}{\Delta\varphi'}\right| < \frac{b}{2a}\,r^{\mu}.$$

Hence $r$ diminishes as $\varphi'$ increases not more rapidly than if

$$\frac{dr}{d\varphi'} = -\frac{b}{2a}\,r^{\mu}.$$

But a direct solution of this differential equation establishes the fact that $\varphi'$
must increase indefinitely and be of order at least $r^{1 - \mu}$ if $r$ is to approach zero.

This is in contradiction with the hypothesis that the direction is not rotated
into or past the barred angle.
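
The direct solution referred to can be sketched as follows. It assumes the comparison equation has the form $dr/d\varphi' = -(b/2a)\,r^{\mu}$ with $\mu > 1$; the initial values $r_0$, $\varphi'_0$ are names introduced here for illustration.

```latex
\frac{dr}{d\varphi'} = -\frac{b}{2a}\,r^{\mu}
\;\Longrightarrow\;
r^{-\mu}\,dr = -\frac{b}{2a}\,d\varphi'
\;\Longrightarrow\;
\frac{r^{1-\mu} - r_0^{1-\mu}}{\mu - 1} = \frac{b}{2a}\,(\varphi' - \varphi'_0)
\;\Longrightarrow\;
\varphi' = \varphi'_0 + \frac{2a}{(\mu - 1)\,b}\left(r^{1-\mu} - r_0^{1-\mu}\right).
```

Since $\mu > 1$, the term $r^{1-\mu}$ grows without bound as $r$ tends to $0$, so $\varphi'$ must increase indefinitely and is of order $r^{1-\mu}$ along such a solution.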




Thus we see that the stated property holds for $T$. In the same way it
may be proved for $T_{-1}$.

The following property is also useful:

Given an arbitrary positive $K$, then for any point $(r, \varphi)$ with $0 < r < \delta$ ($\delta$ sufficiently
small) we have

$$\varphi_n > \varphi + n\theta + K$$

for $n > N$ until $r_n > d$.

From the equations (28) we get the inequalities

$$|r_1 - r| < c'\,r^{\mu + 1}, \qquad \varphi_1 - \varphi > \theta + c''\,r^2, \qquad (c' > 0,\ c'' > 0).$$

From the second of these inequalities there results

$$\varphi_n > \varphi + n\theta + c''\sum_{j=0}^{n-1} r_j^2$$

as long as $r, r_1, \ldots, r_{n-1}$ are less than $d$. It suffices to prove that the sum on
the right exceeds $K/c''$ before $r_n > d$, provided that $r$ is sufficiently small. Now
from the first inequality we deduce

$$|r_p - r_q| < c'\sum_{j=q}^{p-1} r_j^{\mu + 1}.$$

If $r_p$ and $r_q$ are the maximum $M$ and minimum $m$ of $r_j$, this yields for $\mu \ge 3$

whence

Consequently, if $r$ is sufficiently small and varies to a relatively much larger
(but still small) value or to a relatively much smaller value, the corresponding
sum $\sum r_j^2$ is very large and exceeds $K/c''$.




§ 44. Nature of the invariant curves. 

Let us define a regular neighborhood of an elliptic invariant point $(0, 0)$ as
a neighborhood such that any radial direction in the $r\varphi$-plane is rotated through
a negative angle by $T$, and through a positive angle by $T_{-1}$.

Probably a hyperbolic point cannot lie in a neighborhood of this kind.

The reasoning of § 43 shows a regular neighborhood of this type to exist
in case II', $l = 1$.

The elliptic case does not arise in the general case II'' or III'. But a
direct computation shows that, in case II', $l$ finite, and in what may be termed
the general elliptic subcases II'' and III', a regular neighborhood exists.

Throughout such a neighborhood we can evidently construct a barred angle
through each point of the neighborhood such that directions outside the barred
angle are rotated negatively by $T$ and positively by $T_{-1}$, and ultimately are
rotated into or past the barred angle.

In a regular neighborhood of an elliptic invariant point of type II', $l = 1$,
any invariant curve enclosing the invariant point meets every radius vector through
the invariant point in only one point. If the barred angle in the plane be drawn
at the corresponding point the curve lies entirely within it on either side in the
vicinity of the point.

In order to demonstrate this fundamental property of the invariant curves
we make use of the $r\varphi$-plane employed above.

Let us suppose first that the invariant curve $L$ under consideration is defined
by means of an inner simply connected open continuum $\Gamma$ containing
$(0, 0)$ in the $uv$-plane.

If the first italicized statement is not true consider the continuum of points
accessible from $r = 0$ along a perpendicular line, $\varphi = \text{const.}$, in the $r\varphi$-plane without
passing a point of the invariant curve itself.

This open continuum Γ* forms all or part of the open continuum Γ bounded by
the invariant curve L and r = 0 (see figure). The boundary of Γ* is evidently
a closed curve made up of points of L and open segments of lines φ = const.

If Γ and Γ* coincide then either the first property is true or there exist one
or more boundary segments φ = const. of Γ and Γ*. Now either Γ* lies to the
right or to the left of such a segment. In the first case the tangent to the
boundary makes an angle π/2 with the φ-axis. An application of T^{-1} will carry
this segment into another with tangent argument greater than π/2. But, on
account of the form of the boundary, any tangent argument must be intermediate
between −π/2 and π/2, so that this is not possible.

In the same way it may be concluded that Γ* cannot lie to the left of such
a segment φ = const. Hence there is no such segment, if Γ and Γ* coincide.

In this case of coincidence every radial line must meet Γ only once, as we
wished to prove.

Let us now turn to the case when the two continua Γ and Γ* are distinct.
Consider the part Γ̄ of Γ accessible from r = 0 along a regular simple curve
in Γ (such as MN in figure above) with tangent argument never less than π/2.
This part of Γ evidently includes Γ*, but can only coincide with Γ* if there
are no bounding segments φ = const. of Γ* which have a part (see region l of
figure above) of Γ on the right. By the transformation T^{-1}, which increases
every tangent argument which is equal to π/2 and does not diminish to π/2 any
greater argument, the points of Γ̄ are carried into points of Γ which are still
accessible from r = 0 along an auxiliary regular simple curve in Γ with tangent
argument greater than π/2, namely along the image of the auxiliary curve. Thus
the continuum Γ̄_{-1} forms part of Γ̄. Hence Γ̄_{-1} coincides with Γ̄, since T is
conservative.

Consequently Γ* has no boundary segments φ = const. with part of Γ on
the right. Similarly we can exclude the possibility of boundary segments
φ = const. of Γ* with part of Γ to the left. Hence Γ* coincides with Γ. The
first italicized statement has previously been demonstrated in this case.

It is now easy to show that the invariant curve lies within the barred 
angle in the vicinity of any one of its points. 

Suppose for example it lies partially above the upper right arm of this
angle. By sufficient iteration of T^{-1} the direction of this arm rotates positively
past the vertical, and it is intuitively clear that the radial line φ = const. through
this point will meet the invariant curve more than once, contrary to what has
been proved above.

If the invariant curve is defined by means of an outer continuum, essentially 
the same argument leads us to the same conclusion. 




§ 45. Rotation numbers. 

Consider any closed set of points defined by an angular coordinate τ of
period 2π. Let us suppose a transformation given which takes each point τ of
the set into a definite point τ_1, in such wise that if P precedes Q (i.e. the τ of
P is less than the τ of Q) then P_1 precedes or coincides with Q_1, and also such
that τ_1 varies continuously with τ. In particular if P and Q are the same point,
represented by angular coordinates τ, τ' differing by 2lπ, then τ_1, τ_1' differ by
2lπ also.

Consider now the difference τ_k − τ for all points P, where τ_k denotes the
coordinate after k applications of the transformation. If τ increases through
all the values of the set by 2π so does τ_k. It follows that we have

    a^{(k)} ≤ τ_k − τ ≤ b^{(k)}     (b^{(k)} < a^{(k)} + 2π).

In fact suppose τ_k − τ is a minimum for τ = τ* and let τ vary by 2π from this
minimum. Since τ_k increases but only by 2π altogether we have at the
maximum of τ_k − τ

    τ_k − τ < τ*_k − τ* + 2π,

which establishes the statement.

There is a number σ such that for every k

    a^{(k)} ≤ kσ ≤ b^{(k)}.

For, if this is not the case, two intervals (a^{(k)}/k, b^{(k)}/k), (a^{(l)}/l, b^{(l)}/l)
will fail to have a common point so that for instance

    b^{(k)}/k < a^{(l)}/l,

whence

    l b^{(k)} < k a^{(l)}.

But, since τ_k − τ ≤ b^{(k)} for any τ, we have successively

    τ_k − τ ≤ b^{(k)},  τ_{2k} − τ_k ≤ b^{(k)},  ...,  τ_{lk} − τ_{(l−1)k} ≤ b^{(k)},

whence by addition

    τ_{lk} − τ ≤ l b^{(k)}.

Also since τ_l − τ ≥ a^{(l)} for any τ we get similarly

    τ_{lk} − τ ≥ k a^{(l)}.

These two inequalities and the inequality written above are contradictory.

The number σ will be called the rotation number of the transformation
τ_1 = f(τ).¹ Evidently σ measures what may be regarded as the mean angular
advance of the points under this transformation, inasmuch as we have for any
τ and k

    |τ_k − τ − kσ| < 2π.

Since kσ lies on the interval (a^{(k)}, b^{(k)}) some points advance more than kσ,
and some by less than kσ.
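The mean-advance characterization above lends itself to a small numerical sketch. The following is only an illustration under stated assumptions: the map `lift` (with its parameters `omega` and `eps`) is a hypothetical order-preserving example, not one taken from the text, and the rotation number is merely estimated from a finite number k of iterations.

```python
import math

def rotation_number(f, tau0=0.0, k=10_000):
    """Estimate the rotation number as (tau_k - tau_0) / k for a lift f."""
    tau = tau0
    for _ in range(k):
        tau = f(tau)
    return (tau - tau0) / k

def lift(tau, omega=1.0, eps=0.3):
    # Hypothetical example map: increasing because |eps| < 1,
    # and f(tau + 2*pi) = f(tau) + 2*pi as the text requires.
    return tau + omega + eps * math.sin(tau)

def max_deviation(f, sigma, k, samples=200):
    """Largest |tau_k - tau - k*sigma| over a grid of starting angles."""
    worst = 0.0
    for i in range(samples):
        tau0 = 2.0 * math.pi * i / samples
        tau = tau0
        for _ in range(k):
            tau = f(tau)
        worst = max(worst, abs(tau - tau0 - k * sigma))
    return worst

sigma_estimate = rotation_number(lift)
deviation = max_deviation(lift, sigma_estimate, k=50)
```

Since each step of this example advances τ by an amount between 0.7 and 1.3, the estimate must fall in that range, and `deviation` can be compared with the bound 2π stated above.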

When τ_1 = f(τ) is a one-to-one transformation, then its inverse has evidently
the negative rotation number −σ.

If σ/2π is rational, say σ/2π = p/q, p, q relatively prime integers, then we have

    a^{(k)} ≤ 2kpπ/q ≤ b^{(k)},

and hence

    a^{(q)} ≤ 2pπ ≤ b^{(q)}.

Consequently τ_q − τ is exactly equal to 2pπ for some τ. It follows that,
if σ/2π = p/q, there is at least one point P for which τ increases by precisely 2pπ
upon q iterations of the transformation.
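The existence argument just given is an intermediate-value argument, so such a point can be located numerically by bisection. The sketch below assumes a hypothetical increasing map `f` constructed so that its rotation number is π (p = 1, q = 2); the map, the bracket [0, 1] and the tolerance are illustrative choices, not part of the original text.

```python
import math

def f(tau):
    # Hypothetical increasing lift; f(0.3) = 0.3 + pi and f(0.3 + pi) = 0.3 + 2*pi,
    # so the point tau = 0.3 advances by exactly 2*pi in two iterations.
    return tau + math.pi + 0.4 * math.sin(2.0 * (tau - 0.3))

def iterate(g, tau, q):
    for _ in range(q):
        tau = g(tau)
    return tau

def periodic_point(g, p, q, lo, hi, tol=1e-12):
    """Bisect on h(tau) = g^q(tau) - tau - 2*p*pi over the bracket [lo, hi]."""
    def h(tau):
        return iterate(g, tau, q) - tau - 2.0 * p * math.pi
    h_lo, h_hi = h(lo), h(hi)
    if h_lo * h_hi > 0:
        raise ValueError("h does not change sign on the bracket")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        h_mid = h(mid)
        if h_mid == 0.0:
            return mid
        if h_lo * h_mid < 0:
            hi = mid
        else:
            lo, h_lo = mid, h_mid
    return 0.5 * (lo + hi)

root = periodic_point(f, p=1, q=2, lo=0.0, hi=1.0)
```

The sign change of τ_q − τ − 2pπ on the bracket plays the role of the inequality a^{(q)} ≤ 2pπ ≤ b^{(q)} in the text.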


§ 46. Rotation numbers along invariant curves. 

Returning now to the invariant curves about an invariant point in the
elliptic case II', l = 1, it appears that for each such curve there is a definite
rotation number, for T yields a one-to-one, continuous transformation of each
such curve into itself which preserves order.

If such an invariant curve has a rotation number commensurable with 2π, say
2pπ/q, it is made up of a finite number of analytic arcs ending at hypercontinuous
points, invariant under T^q.

¹ Introduced by Poincaré (loc. cit. § 8).



For mark off the invariant points under T^q on this curve about (0, 0).
There exists at least one such point by the remark proved in § 45. The invariant
curve near these invariant points forms then invariant curves in the sense
of § 41. These points are thus hyperbolic and the invariant curves are
hypercontinuous at the invariant points. But, by indefinite iteration of T^q or
T^{-q}, the part of the arc is carried into all of itself, since there are no invariant
points on the arc save at its end points.

There are only a finite number of isolated invariant points on the invariant
curve under T^q or else the limit invariant point would have a non-analytic
invariant curve of the type excluded in § 41.

Thus the statement is proved. 

If two such invariant curves have one or more points in common, the rotation
numbers of the two curves are the same and of the form 2pπ/q. These common points
and arcs are finite in number and invariant under T^q.

If two invariant curves have points in common, but nevertheless are not
coincident, there will be one or more continua included between them. Since
the invariant curves are each cut only once by a radial line φ = const. in the
rφ-plane, these continua are of the form

    f(φ) < r < g(φ),    φ' < φ < φ'',

where f, g are continuous functions of φ with f < g for φ' < φ < φ'' and f = g for
φ = φ' or φ = φ''.

Evidently the transformation T carries any continuum of this type into
another of the same type, included between the same two invariant curves. By
further repetition of T this continuum is carried into others, which cannot all be
distinct inasmuch as ∬ Q du dv has the same value for any of them, and the
total value of this integral taken over the complete neighborhood of the invariant
point is finite. Hence after a certain number q of iterations the original continuum
is carried into itself, and its two boundary arcs are carried into themselves. The
end points of these arcs are therefore invariant under T^q.

If these invariant points are rotated p times around the invariant point
by T^q clearly the rotation number belonging to either invariant curve is 2pπ/q.
This demonstrates the first part of our statement.

Moreover it has been seen above that on such an invariant curve there are a
finite number of invariant points and invariant point arcs of the specified type
under T^q. Any point of an invariant arc terminated by invariant points will
approach one end or the other under indefinite iteration of T^q or of T^{-q}.

Acta mathematica. 43. Printed 22 March 1920.

If two such invariant curves are entirely distinct from one another, the rotation
number of the one further removed from the invariant point is the greater, and both
rotation numbers exceed θ.¹

Consider the two curves in the rφ-plane. Since T carries a line φ = const.
in a regular neighborhood into a curve cut by any line φ = const. at most once
we see that φ_1 for the outer curve exceeds φ_1 for the inner curve, and both are
greater than the initial φ by more than θ. Hence we can affirm that φ_1 along
the outer curve exceeds φ_1 along the inner curve by a definite amount δ, and
that this φ_1, in turn, exceeds the initial φ by at least θ + δ.

This fact shows at once that the rotation number of the outer curve is at
least as great as that of the inner curve. For, points initially with the same φ
on the two curves are taken into points such that the φ of the outer curve
exceeds that of the inner curve by at least δ under indefinite iteration of T.
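The ordering of rotation numbers can be seen most simply in an integrable model, where the invariant curves are the circles r = const. The sketch below assumes the hypothetical map r_1 = r, φ_1 = φ + θ + c r² with illustrative values of θ and c > 0; it is not the general transformation T of the text, only the simplest instance in which the outer curve rotates faster and both rotation numbers exceed θ.

```python
THETA = 0.5   # assumed rotation angle at the invariant point
C = 1.0       # assumed twist coefficient, c > 0

def twist(r, phi):
    """Integrable model: r_1 = r, phi_1 = phi + theta + c * r**2."""
    return r, phi + THETA + C * r * r

def rotation_number_on_circle(r, n=1000):
    """Mean advance of phi per iteration along the invariant circle of radius r."""
    radius, phi = r, 0.0
    for _ in range(n):
        radius, phi = twist(radius, phi)
    return phi / n

inner = rotation_number_on_circle(0.1)
outer = rotation_number_on_circle(0.2)
```

In this model the rotation number on the circle of radius r is θ + c r², so `inner` and `outer` differ by the definite amount c(0.2² − 0.1²).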

To establish our statement that the outer curve has the greater rotation 
number it is thus only needful to exclude the possibility that their rotation 
numbers are the same. 

If the two rotation numbers are the same and rationally related to 2π,
then for some q the transformation T^q will have a rotation number 2pπ (p an
integer), so that there will exist points on both curves which are carried into
themselves by this transformation, φ being increased by precisely 2pπ.

Suppose now that we follow the transformation T^q by a rotation in the
uv-plane through p complete negative revolutions. The resultant transformation
will evidently be conservative with the same invariant area integral as before,
and the two curves will appear as invariant curves with the rotation number 0.
It is also evident that by this resultant transformation points on the outer curve
have their coordinate φ increased by at least δ more than the increase in the
like coordinate of the corresponding point on the inner curve.

Construct a curvilinear quadrilateral in the rφ-plane as follows. One vertex
will be an invariant point of the inner curve under the resultant transformation
and a second vertex the corresponding point on the outer curve. A third vertex
will be the first invariant point on the outer curve with greater φ, and a fourth
the corresponding point of the inner curve. The quadrilateral will then consist
of the two radial segments φ = const. through the two pairs of corresponding
points, and the two arcs of the invariant curves included between them.


¹ The assumption c > 0 is still made. This entails no specialization of course.




Consider the image of this quadrilateral under the resultant transformation.
The invariant points remain fixed, but the two sides φ = const. are carried into
curves with tangent argument which is everywhere less than the argument π/2
before the transformation. In the image quadrilateral then, the curvilinear side
through the invariant point on the inner curve lies to the right of the point,
while the opposite side through the other invariant point on the outer curve
lies to the left of this second invariant point. Consequently the quadrilateral
has been taken into part of itself, the two sides formed of arcs of the invariant
curves being carried over into part of themselves. This is impossible of course
with a conservative transformation.

It is still easier to dispose of the case when both rotation numbers θ' are
assumed to be equal but not rationally related to 2π. Here again if a point on
the outer invariant curve has a coordinate φ not less than that of a point on
the inner invariant curve, then, under indefinite iteration of T, it will always be
true that the coordinate φ of any image of the first point will be greater by at
least δ than the coordinate φ of the like image of the second point.

Choose now a positive integer q such that qθ' is less than some integral
multiple of 2π, say 2pπ, by a quantity less than δ. It is always possible to do
this precisely because θ' is incommensurable with 2π. Every point on the inner
invariant curve will then be advanced by less than 2pπ under T^q. On the
other hand, since the transformation T^q has a rotation number qθ', it is always
possible to choose a point of the inner curve which has its φ coordinate increased
by at least qθ' under T^q. The corresponding point of the outer curve then has
its φ coordinate increased by at least as much as qθ' + δ, i.e. by more than
2pπ. Hence the rotation number of the outer curve under T^q is at least as
great as 2pπ. This is impossible.
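The choice of q in the argument above can be made concrete by a direct search. The following sketch assumes a positive rotation number `theta_prime` incommensurable with 2π and a gap `delta`; the function name `choose_q` and the search cap `q_max` are hypothetical, and floating-point arithmetic only approximates the exact statement.

```python
import math

def choose_q(theta_prime, delta, q_max=1_000_000):
    """Return (q, p) with 0 < 2*p*pi - q*theta_prime < delta, or None."""
    two_pi = 2.0 * math.pi
    for q in range(1, q_max + 1):
        advance = q * theta_prime
        p = math.floor(advance / two_pi) + 1   # smallest p with 2*p*pi > advance
        gap = two_pi * p - advance
        if 0.0 < gap < delta:
            return q, p
    return None

q, p = choose_q(theta_prime=1.0, delta=0.1)
gap = 2.0 * p * math.pi - q * 1.0
```

With such a pair, an advance of at least qθ' + δ exceeds 2pπ, which is the step used to reach the contradiction.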


§ 47. On rings of instability.

If C_1 and C_2 are invariant curves in a regular neighborhood of an invariant
point in the stable case II', there may either be further invariant curves on the
ring C_1C_2 or not. If there are, the ring C_1C_2 may be subdivided further into
similar rings. Thus the neighborhood of the invariant point is divided into an
infinite succession of rings of instability (reducing to single invariant curves in
the integrable case). Each ring of this sort is bounded by two invariant curves
C', C'' and has no invariant curve upon it other than C' and C''. We shall only
prove:

Let C', C'' be entirely distinct invariant curves forming the boundary curves
of a ring of instability. Then for any ε > 0 an integer N can be assigned such
that a point P' exists within a distance ε of any point P of C' (or C'') which
goes into a point Q' within distance ε of any point Q of C'' (or C') in n < N
iterations of T (or T^{-1}).

In the contrary case points P and Q exist for which no point P' can be
found for some ε and any N. Consider a small circle with P as center and of
radius ε in this case. By iteration of T this region is carried into others, all lying
partly within the ring, but not extending to C''. Consider the open continuum
lying outside of C' and occluded by all of these regions. This continuum is
carried into all or part of itself by T. But it cannot be carried into part of
itself. Hence the boundary curve is invariant under T. But this curve is distinct
from C'' as well as C', since it does not approach within distance ε of the point Q.
Such an invariant curve does not exist in C'C'' by hypothesis.


§ 48. The other stable elliptic cases. 

In the case II', l ≠ 1 but finite, the fundamental equalities (28) may be
replaced by

    r_1 = r + R(r, φ),    φ_1 = φ + θ + c r^{2l} + S(r, φ),

where R, S have the same character as before. Hence we see that a regular
neighborhood of (0, 0) exists in this case. Here the arguments made above for
the case II', l = 1, apply without modification.

This is also true in the general elliptic subcases II'', III' (see § 45).

In the case II', l finite, in the general elliptic subcases II'', III', and, more
generally, whenever there exists a regular neighborhood of an elliptic invariant point,
all of the properties of invariant closed curves established in case II', l = 1, continue
to hold.

It is highly probable that an integer analogous to l in the case II' can be
defined in all elliptic cases and that, if the notion of regular neighborhood be
generalized appropriately, such a neighborhood exists when l is finite. When
l = ∞ it appears possible that an invariant linear family of series 0 + cH
exists.

The formal and analytical questions to be answered here are extremely 
important and interesting. 




§ 49. Invariant curves and the function F*.

In case II', l = 1, the invariant closed curves investigated above are closely
like the curves F = const. where F is a polynomial in u, v obtained by breaking
off F* at terms of high degree μ.

To see this let us employ the variables u, v osculating the normalizing
variables of § 22 to high order. The formulas (28) and (31) show that the tangent
directions along the invariant curve in the rφ-plane have a slope less than
k r^{μ−1}. If the slope exceeds this magnitude the invariant curve will not lie
within the barred angle at the corresponding point of the invariant curve.

On the other hand F* is given by −½θ(u² + v²) out to terms of degree μ + 1.

Combining these results and observing that μ is arbitrary, we find:

In the stable case II', l = 1, if F stands for the polynomial in u, v formed by
the terms of degree less than μ in F*, then |F' − F''| < k|F'|² at any points P', P''
of an invariant curve.

Evidently similar results hold in any case II', II'', III', III'' when a regular
neighborhood exists.


§ 50. Remark on the integrable elliptic case.

In the integrable elliptic case there is a family of closed analytic invariant
curves F* = const. about the invariant point.

Since an area integral ∬ Q(u, v) du dv is invariant under T, an integral
of the form

    ∬ P(σ, τ) dσ dτ

remains invariant, where σ is a parameter varying from curve to curve, and τ
an angular parameter varying from 0 to 2π as each curve is described. But T
has the form

    σ_1 = σ,    τ_1 = f(τ, σ)

so that

    P(σ_1, τ_1) ∂τ_1/∂τ = P(σ, τ).

Consequently along any particular curve

    ∫ P(σ, τ) dτ = ∫ P(σ_1, τ_1) dτ_1.

Thus, if ∫ P(σ, τ) dτ be taken as proportional to a modified parameter τ, the
equations for T take the form

    σ_1 = σ,    τ_1 = τ + g(σ),

where g is an analytic function of σ for σ ≠ 0.

A noteworthy special feature of this case now appears:

In the integrable elliptic case if any invariant closed curve of the analytic family
F* = const. has a rotation number 2pπ/q then T^q leaves every point of the curve
invariant.
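The special feature just stated follows directly from the normal form τ_1 = τ + g(σ), and it can be checked on a toy example. The sketch below assumes a hypothetical rotation function `g` chosen so that g(0.5) = 2π · 2/5; the names and the tolerance are illustrative only.

```python
import math

TWO_PI = 2.0 * math.pi

def T(sigma, tau, g):
    """Normal form sigma_1 = sigma, tau_1 = tau + g(sigma), with tau taken mod 2*pi."""
    return sigma, (tau + g(sigma)) % TWO_PI

def returns_after(q, sigma, tau, g, tol=1e-9):
    """True if q applications of T bring (sigma, tau) back to itself."""
    s, t = sigma, tau
    for _ in range(q):
        s, t = T(s, t, g)
    diff = abs(t - tau) % TWO_PI
    return s == sigma and min(diff, TWO_PI - diff) < tol

def g(sigma):
    # Hypothetical rotation function; g(0.5) = 2*pi * 2/5, i.e. p = 2, q = 5.
    return TWO_PI * (0.3 + 0.2 * sigma)

all_fixed = all(returns_after(5, 0.5, TWO_PI * j / 7, g) for j in range(7))
```

On the curve σ = 0.5 every sampled point returns after five iterations, whereas on a neighboring curve such as σ = 0.6 the same power of T displaces the points.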

It is obvious that the integrable case is not the general case, inasmuch as 
such an invariant point curve will not exist in general. 

It would be a vital advance to be able to determine the distribution of the
invariant curves in the non-integrable case by analytic tests. This appears to
be possible only in the case of a rotation number commensurable with 2π, when
the invariant curve is hypercontinuous.


Chapter IV. Elliptic invariant points. Unstable case. 

§ 51. Existence of α and ω points. General case.

In the unstable case a neighborhood of the invariant point (0, 0), of the
form r < d say, can be so taken that, under indefinite iteration of T or of T^{-1},
points arbitrarily near (0, 0) leave this neighborhood. It has previously been
pointed out that this property holds for both T and T^{-1} if it holds for either
(§ 42). We restrict attention to such a neighborhood D.

Let us fix attention upon ω points which remain in D under indefinite
iteration of T and upon α points which remain in D under indefinite iteration
of T^{-1}. The two sets of points are clearly closed sets.



An ω point is evidently carried into an ω point by T, and also by T^{-1} if its
image under T^{-1} lies in D. Similar results hold for α points.

For an unstable elliptic point the point set of α (ω) points has a connected subset
A (Ω) extending from r = 0 to the boundary of D.¹

Take a very small neighborhood r < δ of (0, 0). Under iteration of T^{-1} N
times (N large), some point of the image extends out to r = d, by virtue of the
instability. Within this image we can draw a curve from r = 0 to r = d which
remains within D under N iterations of T of course. This curve cuts any closed
curve about (0, 0) in at least one point. By a limiting process, in which N
becomes larger and larger, we see that at least one ω point lies on any such
curve. Similarly an α point lies upon it.

It is then intuitively evident that the italicized statement holds, inasmuch
as an α point lies upon every such closed curve in D which encloses the invariant
point.

The following is evident:

The connected sets A (Ω) are carried into parts of themselves by T^{-1} (T) and
into all of themselves together with a part outside of D by T (T^{-1}).

Let us term an unstable invariant point regular if there do not exist closed
invariant curves in D of which it is a boundary point.

The regular case embraces the general unstable elliptic case for l finite
and indeed any case in which the invariant point is surrounded by a regular
neighborhood. Consider for simplicity the general case II', l = 1. Here r = 0
functions as an invariant curve in the rφ-plane of polar coordinates. If another
invariant curve has a point in common with this invariant curve (i.e. with
r = 0) the rotation number is θ of course and commensurable with 2π (§ 46),
and this is impossible in case II' by definition.

In the regular case the set A (Ω) connected with r = 0 tends uniformly to
r = 0 under iteration of T^{-1} (T).

Suppose if possible that this does not hold for the set Ω. There exists
then a quantity δ > 0 such that for n indefinitely large the set T^n(Ω) does not
lie entirely within r < δ. Now any point of T^n(Ω) is an ω point which remains
in D under n iterations of T^{-1} also. Therefore, recalling that Ω and its images
are connected with r = 0, we see that any curve within r < δ and surrounding
r = 0 has on it at least one point P which remains in D under indefinite
iteration of T and n-fold iteration of T^{-1} (n arbitrarily large). Hence, by a
limiting process, we conclude the existence of a point of type ω and α on this curve.

¹ That is, a chain of α (or ω) points, each point arbitrarily near its successor, extending
from r = 0 to r = d, can be found.




Thus we arrive at a set of points AΩ common to A and Ω, and connected
with r = 0, which reaches out at least to r = δ. This set is clearly carried into
itself by T and forms the inner boundary of an open continuum. Thus the set
AΩ forms a closed invariant curve within the scope of the definition.

However, in the regular case for a sufficiently restricted region D such an
invariant curve does not exist. Thus we have reached an absurdity, so that
the italicized statement under consideration must be true.

An obvious consequence of this property is that the content of the set of
A (Ω) points connected with r = 0 is 0.

We easily infer the following fact to be true:

In the irregular case there exists a connected set of points AΩ reaching to
r = 0 from r = δ > 0 if δ is small enough. The sets A (Ω) tend uniformly toward
the set AΩ under indefinite iteration of T^{-1} (T).

Although the introduction of α and ω points was not necessary in the study
of the unstable hyperbolic points, it is instructive to note that the above
definitions hold there (and even in the stable elliptic case). In particular the
ω points are the points on the analytic invariant curves tending toward (0, 0)
or at least not away from (0, 0) under iteration of T. Similarly the α points
lie on the invariant analytic curves which tend toward (0, 0) on iteration of
T^{-1} or at least do not tend away from it. Thus the sets A and Ω are analytic
curves.
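The remark on hyperbolic points can be illustrated with the simplest linear model. The sketch below assumes the hypothetical transformation T(u, v) = (2u, v/2), the unit disc as D, and a finite number n of iterations standing in for indefinite iteration; none of these choices come from the text.

```python
def T(u, v):
    return 2.0 * u, 0.5 * v       # hypothetical hyperbolic model

def T_inv(u, v):
    return 0.5 * u, 2.0 * v

def stays_in_D(step, point, n, d=1.0):
    """True if the first n images of point under step all satisfy r <= d."""
    u, v = point
    for _ in range(n):
        u, v = step(u, v)
        if u * u + v * v > d * d:
            return False
    return True

def classify(point, n=50):
    """Return (is_alpha, is_omega) for the given point, up to n iterations."""
    omega = stays_in_D(T, point, n)       # remains in D under T
    alpha = stays_in_D(T_inv, point, n)   # remains in D under T^{-1}
    return alpha, omega
```

In this model the ω points fill the segment of the v-axis inside D and the α points the segment of the u-axis, so that the two sets are analytic curves meeting only at the invariant point.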

In general these two sets have only a finite set of points in common and
the points A (Ω) tend uniformly toward (0, 0) under iteration of T^{-1} (T). This
is the regular case. The irregular case arises when invariant point curves are
present. These constitute the points AΩ.

Thus our methods in the hyperbolic case have revealed the precise nature
of the α and ω points: these fall along certain analytic branches ending at
the invariant point. In the following paragraphs we shall extend the idea of
branches to the unstable elliptic case.


§ 52. Further study of the regular case II', l finite. 

Let us confine attention to the rφ-plane of polar coordinates and let us
fix attention upon any point Q which belongs to the set A (Ω) of points α (ω)
connected with r = 0. At least one such point Q lies upon r = d. Let Σ denote
the set of α (ω) points connected with Q for r > δ > 0, where δ is an arbitrarily
small quantity depending on the point of Σ to be obtained.




If a continuum abutting on r = d has the property that all of its points
are accessible from r = d along regular curves without double points and with
tangent argument greater (less) than −π/2, its boundary will be said to be
left-handedly (right-handedly) accessible. Thus the curve of the figure below ending
at Q is right-handedly accessible with MN an auxiliary curve with tangent
argument less than −π/2. With this definition we have the following:

In the case II', l finite, the continuum of points accessible from r = d along
a regular curve to the left (right) of the set Σ of α (ω) points is right-handedly
(left-handedly) accessible, and its boundary extends below r = d indefinitely far
to the left (right) (see figure).


[Figure: the line r = d, the curve ending at Q, and the auxiliary curve MN.]



Take first l = 1. The proof of the first part of this statement follows the
line of argument already employed in § 44. If it is not true, there will exist
above Σ and on its left regions inaccessible from r = d along regular curves
without double points and with tangent argument less than −π/2. Since T^{-1}
rotates vertical lines positively such regions will be carried into similar regions
toward r = 0 by iteration of T^{-1}. This fact stands in contradiction with the
existence of an invariant area integral.

In particular the point Q is the point furthest to the right of the continuum
so defined.

If now the second part of the statement fails to hold, the set Σ does not
extend indefinitely far to the left. Thus we have a connected set Σ extending
from r = 0 to r = d for which φ is limited. But it has been shown that under
iteration of T^{-1} the range of variation of φ along such a set increases indefinitely
(§ 43). Hence an image exists which will cross Σ from left to right. Here we
allow the use of congruent images. But this shows that Σ must be extended
further to the left, since it contains all connected α points. This is absurd.

The properties stated in § 43 extend at once to the general case II', l
finite, so that our statement is true in this case also. It is probably true
whenever a regular neighborhood surrounds the invariant point.

Definition. An unstable invariant point is branched of α (ω) type if a
continuous curve C from r = d to r = 0 can be drawn in the uv-plane with no α (ω)
points on C which are connected to the invariant point by α (ω) points. In
the contrary case it is unbranched.

For the elliptic unstable case II', l finite, of α (ω) branched type, the sets
A (Ω) fall into a set of closed connected branches extending indefinitely far to
the left (right) with lim r = 0 for lim φ = −∞ (+∞), but only a finite distance to
the right (left).

Consider first the set Σ and its congruent sets in the branched case. These
divide the region r < d into component continua and their limit points. The
continuous curve from r = d to r = 0 in the uv-plane which exists in the branched
case becomes a continuous curve lying in one of these continua and approaching
the line r = 0. Since each of these continua lies to the right of an initial point
such as Q this auxiliary curve extends only a finite distance to the right and
infinitely far to the left, approaching r = 0. The analysis situs of the figure
now renders it clear that each lower boundary curve of one of these continua
tends uniformly toward r = 0 as φ becomes negatively infinite.

But the set Σ cannot cross the auxiliary curve and its congruent images.
Hence Σ forms a closed connected set of α points having the properties specified
for the branches.

Two α points A and B connected with r = 0 through α points will be said
to belong to the same α branch if no auxiliary curve C can be drawn between
these points to r = 0. Otherwise two such points belong to different branches.
If B lies to the right of C and A to the left then we will say that the B branch
lies to the right of the A branch.

A branch clearly includes all α points connected with one of its points for
r > δ > 0.

It is apparent that if the A branch lies to the left of the B branch and
the B branch to the left of another branch, then the A branch lies to the left
of this branch also. Thus there is a cyclic ordering of the branches.

The existence of a single auxiliary curve C ensures the existence of an
infinitude of distinct branches, occurring in congruent sets. Such branches have
clearly the form specified, inasmuch as these lie between congruent curves C.

By the transformation T any α branch is carried into a part within r < d
and certain other portions outside. However, there is clearly an α branch of
the image wholly within r < d. If there were more than one, these branches
together with the parts outside of r < d would enclose an area, and this area
would tend toward r = 0 upon iteration of T. There must then be only one
image branch under T.

By the transformation T^{-1} an α branch is evidently carried into a part of
such a branch or all of it.

Similar remarks hold for the ω branches.

In the branched elliptic unstable type II', l finite, the transformation T (T^{-1})
carries an α (ω) branch into such a branch (as specified). The cyclic order of the
branches is preserved.

The last part of the statement is obvious and the first part has just been
proved.

It is clear that we have associated with a branched point an α and ω
rotation number, say θ_α and θ_ω, indicating the rotation of the branches.

The identity of the branches in no way depends on d. Upon iteration by
T each branch is carried into r < δ where δ is arbitrarily small.

Thus θ_α and θ_ω in no wise depend upon d.

For a branched invariant point in the unstable elliptic case II', l finite, the rotation
number of the α (ω) branches is at least (at most) θ.

First, we shall prove that if θ is positive the rotation number θ_α is positive
or zero. In fact if θ > 0 the branch Σ with terminal point Q on r = d goes into
a branch Σ_{-1} entirely to the left of Q. Now Σ_{-1} cannot lie below Σ for then
the region made up of points below Σ and to the left of Q is carried by T^{-1}
into a region lying under Σ and thus forming only part of the region below Σ
and to the left of Q. In the uv-plane we have a corresponding area which is
carried into part of itself, which is not possible. Thus Σ is taken into a branch
to its left and above it, which shows that θ_α is positive or zero in this case.

Consider now the general case and suppose if possible θ_α < θ. Find an
integer m such that for an integer k

    mθ_α < 2kπ < mθ.

Consider the transformation T̄ obtained by following T^m by a shift of the plane
2kπ to the left. For T̄ the rotation number of the invariant point is θ̄ =
mθ − 2kπ and is positive, while θ̄_α = mθ_α − 2kπ is the rotation number of the
branches and is negative. But T̄ satisfies all the conditions imposed on T for
d sufficiently small. Hence we are brought back to the case proved impossible
in the first place.
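The integers m and k used in this reduction can be found by a direct search, since the interval (mθ_α, mθ) eventually becomes long enough to contain a multiple of 2π. The sketch below assumes θ_α < θ; the function name `find_m_k` and the cap `m_max` are hypothetical.

```python
import math

def find_m_k(theta_alpha, theta, m_max=10_000):
    """Smallest m >= 1 and an integer k with m*theta_alpha < 2*k*pi < m*theta."""
    if not theta_alpha < theta:
        raise ValueError("requires theta_alpha < theta")
    two_pi = 2.0 * math.pi
    for m in range(1, m_max + 1):
        k = math.ceil(m * theta / two_pi) - 1   # largest k with 2*k*pi < m*theta
        if two_pi * k > m * theta_alpha:
            return m, k
    return None

m, k = find_m_k(0.5, 0.6)
shifted_theta = m * 0.6 - 2.0 * k * math.pi        # rotation number of the point
shifted_theta_alpha = m * 0.5 - 2.0 * k * math.pi  # rotation number of the branches
```

For these illustrative values the shifted rotation number of the invariant point is positive and that of the branches negative, which is the situation excluded in the first part of the proof.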

In the branchless case greater complexity exists. A discussion of this case 
is much to be desired. 




§ 53. Interrelation of a and w points. 

Consider the a and 10 points of D which are connected with r*=o. The 
first set A forms a closed connected set reaching from r = d to r = o. The 
second S 2 has the same properties. In the rr/>-plane it appears at once that the 
two sets must intersect infinitely often. 

Let us develop briefly the proof of this fundamental fact. The basic reason
which permits this conclusion is that if we have α and ω curves of the type
Σ specified in § 52, one to the left of a point Q of r = d and the other to the
right of a point P of r = d, and if P is taken to the left of Q, there lies
between P and Q a continuum with boundary points all of type α or ω.
Thus there are points of this closed boundary of both types, i. e. belonging to
the boundaries of both regions.

In the branched elliptic unstable case II', l finite, every α branch intersects
every ω branch infinitely often.

In the unbranched case also the A and Ω sets have infinitely many points in
common.

In the branched case then we have what may be described as a network
of α and ω branches. In the general case it is clear that the A and Ω sets
together separate r = 0 from r = d' > 0 for d' sufficiently small.

The lack of definiteness in our general conclusion for the elliptic case is in
startling contrast with that found in the hyperbolic case. I believe, however,
that this corresponds to the extremely general character of the situation. A 
fundamental distinction between the two cases is this: the natural domain in 
the hyperbolic case is the complex variable; in the elliptic case, the real. 


Chapter V. Recurrent point groups. 

§ 54. Point groups. 

Consider an analytic closed surface S of any genus and for the present let
T denote any one-to-one, direct, analytic transformation of this surface into
itself. The problem which we attack is that of determining the behavior of
various classes of points of S under indefinite iteration of T and T^{-1}. Hitherto
we have only considered points in the vicinity of an invariant point. 



Let P be any point of S and consider the infinite sequence of points

..., T^{-2}(P), T^{-1}(P), P, T(P), T^2(P), ...

which will be termed the point group of P. If two members of this sequence
are the same, say if T^α(P) = T^β(P), α < β, then we have T^{β−α}(P) = P. Here the
point P will periodically iterate through a set of β − α distinct points under T.
Thus by considering T^{β−α} instead of T we may apply our earlier results to the
study of the points near this set of points under iteration of T.
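
As an illustration only, the following sketch computes the point group of a
point under a rigid rotation of the circle by a rational angle and finds the
smallest n with T^n(P) = P. The rotation is a stand-in for T chosen here for
simplicity (it is not a map discussed in the paper), and the names rotation
and period are hypothetical.

```python
from fractions import Fraction

def rotation(p, q):
    """Assumed example map: rotation of the circle R/Z by the rational angle p/q."""
    step = Fraction(p, q)
    return lambda x: (x + step) % 1

def period(T, P, max_iter=10_000):
    """Smallest n >= 1 with T^n(P) == P, or None if none is found within max_iter."""
    x = T(P)
    for n in range(1, max_iter + 1):
        if x == P:
            return n
        x = T(x)
    return None

T = rotation(3, 12)              # 3/12 reduces to 1/4 of a full turn
P = Fraction(1, 7)
n = period(T, P)

# Collect the n points P, T(P), ..., T^(n-1)(P) of the periodic point group.
orbit = [P]
for _ in range(n - 1):
    orbit.append(T(orbit[-1]))

print(n, len(set(orbit)))        # the period and the number of distinct points
```

Exact rational arithmetic is used so that the test T^n(P) = P is not disturbed
by rounding; with floating-point angles an approximate comparison would be
needed instead.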

The existence of infinitely many point sets of this particular type and of
special properties may be considered as established by general theorems
concerning the invariant points of such surfaces.¹ A set of points of this type
forms a periodic point group.

Every limit point of the set P, T(P), T^2(P), ... will be termed an ω limit
point of P, and every limit point of the set P, T^{-1}(P), T^{-2}(P), ..., will be
termed an α limit point. A point is counted as often as it appears. In the
periodic case the finite set of points are all α and ω limit points, and there are
no others.

In all cases the limit points of either class form a closed point set. 

The set of α (ω) limit points of P form a set of complete point groups. The
distance of T^k(P) from this limit set approaches 0 for lim k = −∞ (+∞).

Let Q be an α limit point which T^k(P) approaches for k = k_1, k_2, ....
Evidently T^{k+1}(P) will approach T(Q) at the same time. That is to say T(Q),
and likewise T^{-1}(Q), are α limit points if Q is. By repetition of this argument
we infer that all points of the point group of Q are α limit points.

To establish the second part of the theorem we employ an indirect argument.
If T^k(P) did not approach the set of α limit points uniformly for
lim k = −∞ it would be possible to select an infinite set of negative values of
k such that T^k(P) would be distant from any α limit point of P by at least a
definite positive quantity d. There would then be at least one limit point L
of this set, and this point would be at least d distant from any α limit point.
By definition however L is an α limit point, so that a contradiction results.
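
The statement that T^k(P) approaches its limit set can be checked numerically
on a simple example. The sketch below is an assumption-laden illustration: it
replaces the surface by a circle and takes T(t) = t − 0.5 sin t, a map chosen
here (not in the paper) because it has an attracting invariant point at t = 0
and a repelling one at t = π, so that the ω limit set of a nearby point is
{0} and its α limit set is {π}. The inverse is computed by bisection.

```python
import math

def T(t):
    """Assumed example: increasing circle map with invariant points at 0 and pi."""
    return t - 0.5 * math.sin(t)

def T_inv(t, iters=80):
    """Inverse of T by bisection; valid because T is increasing and |T(s) - s| <= 0.5."""
    lo, hi = t - 0.5, t + 0.5
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if T(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def circle_dist(a, b):
    """Distance between two angles measured along the circle."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)

P = 1.0
fwd, bwd = P, P
for k in range(1, 61):
    fwd, bwd = T(fwd), T_inv(bwd)     # fwd = T^k(P), bwd = T^(-k)(P)

d_omega = circle_dist(fwd, 0.0)       # distance of T^60(P) from the omega limit set {0}
d_alpha = circle_dist(bwd, math.pi)   # distance of T^(-60)(P) from the alpha limit set {pi}
print(d_omega, d_alpha)
```

Both distances shrink geometrically with k in this example; in general the
theorem asserts only that they tend to zero.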


§ 55. Recurrent Point Groups. 


Consider now an arbitrary closed set Σ of complete point groups. It was
observed above that the α or ω limit points form such a set of point groups.

¹ See my paper first cited.




More generally, if we take any set of complete point groups and adjoin to it
the limit points, we obtain an enlarged set Σ.

If a set Σ contains no proper closed subset Σ' of the same type we shall
say that Σ is a minimal set. In this case if P is any point of Σ its α (or ω)
limit points form a closed set which must therefore coincide with Σ.

Any complete point group in a minimal set forms a recurrent point group. 

The simplest type of recurrent point group is the periodic type referred
to above.

In all cases but this simplest one, in which Σ has only a finite number
of points, a minimal set Σ consists of a perfect point set. For suppose
a closed minimal set to have an isolated point. This point is its own limit point
under T or T^{-1}. Hence this point is a member of a periodic point group, which
must constitute the minimal set.

In order that a point group generated by P be recurrent it is necessary and
sufficient that for any positive quantity ε, however small, there exists a positive
integer k so large that any k successive points in the point group of P,

T^m(P), T^{m+1}(P), ..., T^{m+k−1}(P),

have representatives within distance ε of every limit point of P.

This condition is necessary. 

If not there is a recurrent point group generated by P, and a positive ε
such that sequences of k points (k arbitrarily large) can be found no point of
which comes within distance ε of some limit point Q of P. As k increases the
point Q has at least one limit point Q' and thus it is clear that for a properly
taken set of sequences no point lies within distance ε/2 of Q'. Take k odd and
consider the middle point L of such a sequence. It and its (k − 1)/2 iterates
under T and T^{-1} lie at a distance at least ε/2 from Q'. Consequently for a
limiting position L' of L we infer that every point of the complete point group
of L' is at distance at least ε/2 from Q'. Hence L' defines a closed set of point
groups lying within the closed minimal set defining the given recurrent motion,
but forming only part of it, and in particular not containing Q'. This is absurd.

To prove the condition sufficient we note first that the set of α and ω
limit points of a point group satisfying this condition must coincide. We need
only to take m > 0 in the arbitrary set to see the truth of this fact. Call the
set of these common α and ω limit points Σ.




If the set Σ is not minimal it would contain a proper subset Σ' of the same
sort to which some point Q of Σ would not belong. Now, when one of the set
of points P, T(P), T^2(P), ... approaches sufficiently near to a point of Σ' it
will remain very near to this closed set for an arbitrarily large number of
iterations under T, and so will not approach Q; it is to be kept in mind that Σ' is
a closed set of complete point groups. Thus the assumed condition would not
be satisfied by the point group generated by P.

Hence Σ is minimal and the point group of P is recurrent.
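
A minimal numerical sketch of the criterion just proved, under assumptions
made here for illustration: the surface is replaced by the circle R/Z and T by
the rotation through the irrational angle (√5 − 1)/2, whose minimal set is the
whole circle. The function recurrence_length (a hypothetical name) searches
for the least k such that k successive points come within ε of every point of
the circle, which for points on a circle is the same as requiring that no gap
between neighbouring points exceeds 2ε.

```python
import math

ALPHA = (math.sqrt(5) - 1) / 2   # assumed irrational rotation angle, in full turns

def window(m, k, start=0.0):
    """The k successive points T^m(P), ..., T^(m+k-1)(P) on the circle R/Z."""
    return [(start + (m + j) * ALPHA) % 1.0 for j in range(k)]

def max_gap(points):
    """Largest arc between cyclically adjacent points."""
    pts = sorted(points)
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    gaps.append(1.0 - pts[-1] + pts[0])   # wrap-around arc
    return max(gaps)

def recurrence_length(eps, k_max=100_000):
    """Least k for which k successive points are within eps of every point of the circle."""
    for k in range(1, k_max + 1):
        if max_gap(window(0, k)) <= 2 * eps:
            return k
    return None

eps = 0.05
k = recurrence_length(eps)
# Because a rotation is an isometry, every window of k successive points is a
# rotated copy of the first one, so the same k serves for all m.
print(k, [max_gap(window(m, k)) <= 2 * eps + 1e-9 for m in (0, 7, 1000)])
```

For a general transformation the windows are not congruent, and the uniformity
in m is precisely what the criterion demands.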


§ 56. The general point group and recurrent point groups. 

The importance of the complete point groups of recurrent type for the
consideration of the general point group is evidenced by the following result:

There exists at least one recurrent point group in the α (ω) limit point group
of any given point P.

Let Σ denote the closed set of α limit points. We need to prove that the
set Σ contains a minimal subset.

Divide the surface of S into a large number of small regions of maximum
span not greater than d, an assigned positive constant. Among the points of
Σ there will be one which enters a least set S' of regions of S under indefinite
iteration of T and T^{-1}. Let Σ' be the corresponding closed set of complete limit
point groups. This set is part of Σ and lies wholly in the same regions S'.

Divide S' into regions of maximum span d/2. Among the points of Σ' there will
be one which enters a least set S'' of regions of S' under indefinite iteration of
T and T^{-1}. Define Σ'' as the closed set of complete limit point groups, which
is part of Σ'.

Proceeding in this way we determine an infinite sequence Σ', Σ'', ... of closed
sets of complete point groups lying wholly upon S', S'', ... respectively. Now let
P^(k) be any point whatever of Σ^(k) and let P denote a limit point of the
set P', P'', .... The point P belongs to Σ of course since it is a limit point of
points of Σ. Furthermore, since P^(k) is contained in Σ, Σ', Σ'', ..., the limit
point P lies on all of the regions S, S', S'', .... Likewise all of its images under
T or T^{-1} lie on these regions. Thus the complete point group generated by P,
and its α and ω limit points, do the same.

Moreover, every point lying in every region S', S'', ... is an α and ω limit
point of P. Otherwise for large positive (or large negative) k, T^k(P) does not
approach some point Q in S', S'', .... Hence it is apparent that the set of points
P, T(P), ... will not enter into some one of the regions S^(k), namely the
particular one containing Q. But this set of points has a set of ω limit point
groups, each with a representative in every one of the minimum set of subregions
which make up S^(k). Thus a contradiction results.

The same argument shows that any point P lying in every region S', S'', ...
has this complete set as its set of α or ω limit points. In other words the set
of points common to S', S'', ... form the desired minimal set.

The following further result shows that either a point P generates a recurrent 
point group under T or else that it successively approaches and recedes from 
such recurrent point groups: 

For any ε > 0 there exists a k so large that any sequence of k points P, T(P), ...,
T^k(P) contains at least one point within distance ε of a recurrent point group.

The proof is immediate. 

If the theorem is not true it is possible to obtain points of this type not
coming within distance ε of any recurrent point group for k arbitrarily large.
Let then Q denote the middle point of such a set (k being taken odd). If Q'
is a limit point of points Q for lim k = ∞ evidently the complete point group
generated by Q' has none of its points within distance ε of any recurrent point
group. But the set of α and ω limit points of Q' contains a minimal set. Thus
a contradiction appears, since every point group in a minimal set is by
definition recurrent.


§ 57. Continuous recurrent point groups. 

Recurrent point groups Σ may be classified as follows: if a point P of Σ
exists such that all sufficiently near points of Σ are connected to P through
Σ then P is of continuous type; in the contrary case Σ is of discontinuous type.

From every standpoint the first type is the simpler. 

There are two extreme types of continuous recurrent point groups, namely
the zero-dimensional or periodic type in which Σ consists of a finite set of isolated
points, and the two-dimensional type in which Σ fills an area. But this area
has no boundary since these boundaries would form a closed subset of point
groups of the minimal set Σ. Hence this area comprises all of S. Consequently
S has no invariant points under T, and so has the connectivity of a torus, at
least if T can be generated by a deformation.¹

¹ See my paper first cited.




If φ, ψ are angular coordinates on a torus and if α, β are incommensurable
with 2π and with each other, a transformation T of this type is defined by


φ_1 = φ + α, ψ_1 = ψ + β.


Thus the two-dimensional type exists. The precise structure of this type is not 
here determined further. 
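
The torus translation above is easy to iterate. The following sketch, with
angles chosen here purely as an assumption (α = 2π(√2 − 1), β = 2π(√3 − 1), so
that 1, α/2π, β/2π are rationally independent), counts how many cells of a
coarse grid on the torus are entered by the iterates of a single point. The
names T and cells_visited are illustrative only.

```python
import math

TWO_PI = 2 * math.pi
ALPHA = TWO_PI * (math.sqrt(2) - 1)   # assumed angle, incommensurable with 2*pi
BETA = TWO_PI * (math.sqrt(3) - 1)    # assumed angle, incommensurable with 2*pi and ALPHA

def T(phi, psi):
    """phi_1 = phi + alpha, psi_1 = psi + beta, both reduced modulo 2*pi."""
    return (phi + ALPHA) % TWO_PI, (psi + BETA) % TWO_PI

def cells_visited(n_iter, grid=10, start=(0.0, 0.0)):
    """Number of cells of a grid x grid subdivision entered by the first n_iter points."""
    seen = set()
    phi, psi = start
    for _ in range(n_iter):
        i = min(int(phi / TWO_PI * grid), grid - 1)
        j = min(int(psi / TWO_PI * grid), grid - 1)
        seen.add((i, j))
        phi, psi = T(phi, psi)
    return len(seen)

print(cells_visited(50), cells_visited(50_000), "of", 10 * 10)
```

Entering every cell of a fixed grid is of course only finite evidence that the
point group fills the torus densely; it is not a proof.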

The remaining one-dimensional continuous type arises when some but not
all of the points of S near P belong to Σ, and are connected with it through
nearby points of Σ.

Thus Σ falls into a set of connected subsets, which undergo some sort of
permutation under T or T^{-1}. Since P is carried into its own immediate
neighborhood on sufficient iteration of T, the connected set containing P is carried
into itself after a finite set of iterations.

It appears then that Σ consists of a set Σ'_0 containing P, and of its distinct
successive images Σ'_1, Σ'_2, ..., Σ'_{k−1}, while Σ'_k coincides with Σ'_0. Let us
consider then T^k, which carries Σ'_0 into itself, and for which Σ'_0 is also a
recurrent point group.

Now Σ'_0 is either a simply or multiply connected point set. By using a
known theorem due to Brouwer¹ we will prove that it must be multiply
connected. For, if not, Σ'_0 forms a simply connected set on a part of S which can
be represented in the plane, and is invariant under T. Moreover this set has
no inner point, for the boundaries would then constitute a smaller closed set of
complete point groups. Consequently by the theorem referred to there exists
an invariant point of Σ'_0, which is absurd.

Hence the set Σ'_0 is multiply connected.

If S has the connectivity of the sphere then Σ'_0 divides the surface of S into
two or more parts. But in one of these there is a point invariant under T by
another theorem also due to Brouwer.² Consequently its boundary is invariant
under T and must constitute all of Σ'_0. Here then Σ consists of a finite set of
closed two-sided curves, all outside of one another.

More generally, consider the neighborhood of a point near Σ but not on it
and follow along near Σ until a complete circuit is made. The boundary so
outlined is carried into itself or into one of a finite number of similar boundaries

¹ ² [Title illegible in this copy], Proceedings of the Section of Sciences,
Koninklijke Akademie van Wetenschappen te Amsterdam, vols. 11-15 (1908-1912).
In the [illegible] he develops the notion of class of [illegible] given later by
myself in the paper first cited without knowledge of his work.

Acta mathematica. 43. Imprimé le 23 [illegible] 1920.




under T^k. Hence T^{kl} carries this boundary and similarly its images under T^k,
T^{2k}, ..., T^{(l−1)k} into themselves. Each boundary is thus recurrent under T^{kl},
and if two boundaries have any points in common all of their points are in
common. Since all of the boundaries form a set which hangs together the images
can consist only in the boundary of a single closed two-sided curve.

Thus continuous recurrent point groups lie in minimal sets which are either
made up (1) of a finite set of points, (2) of a finite set of closed two-sided
curves on S, or (3) of all the points of S.

In the one-dimensional case a single angular variable and a definite rotation
number arise. A fundamental question is whether a similar representation
in the two-dimensional case, by means of two angular variables and two
characteristic rotation numbers, is possible.


§ 58. Discontinuous recurrent point groups.

An immediate division of the types of discontinuous recurrent point groups
is possible. In the first case no point P of the minimal set Σ is connected with
any other point through Σ; this is the totally discontinuous type. In the second
this is not the case; here we have the partially discontinuous type.

For the second case Σ falls into connected sets which are permuted among
themselves by T just as the points are in the first case. The existence of this
second category of recurrent point groups is doubtful when T has the properties
which we have assumed. On the other hand the totally discontinuous type of
recurrent point groups exists in important cases.

Inasmuch as analytic weapons are lacking we content ourselves merely 
with some examples and with making an attempt at classification in the totally 
discontinuous type. 

Let f(t) be a continuous increasing function of t such that f(t) − t is periodic
of period 2π. Then t_1 = f(t) defines a one-to-one continuous direct transformation
of a circle (on which t is an angular coordinate) into itself. This is associated
with a definite rotation number θ and defines at least one recurrent point group
on the circle, which need not coincide with the whole circle. Its minimal set is
represented by a perfect nowhere dense point set on the circle.¹ We limit
attention to the corresponding values of t.
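
The rotation number of such a transformation can be estimated as the mean
advance of t per iteration. The sketch below assumes, for illustration only,
the family f(t) = t + a + b sin t with |b| < 1, which is increasing and for
which f(t) − t has period 2π; this family and the names make_lift and
rotation_number are choices made here, not taken from the paper.

```python
import math

def make_lift(a, b):
    """Assumed example f(t) = t + a + b*sin(t); increasing when |b| < 1."""
    return lambda t: t + a + b * math.sin(t)

def rotation_number(f, t0=0.0, n=100_000):
    """Estimate of the rotation number as the average advance (f^n(t0) - t0) / n."""
    t = t0
    for _ in range(n):
        t = f(t)
    return (t - t0) / n

rigid = make_lift(0.5, 0.0)      # rigid rotation: the rotation number equals a
pinned = make_lift(0.0, 0.5)     # has invariant points, so the rotation number is 0
print(rotation_number(rigid), rotation_number(pinned, t0=1.0))
```

The estimate differs from the true rotation number by a term of order 1/n, so
the second value printed is small but not exactly zero.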

¹ See G. D. Birkhoff, Quelques théorèmes sur le mouvement des systèmes dynamiques,
Bulletin de la Société Mathématique de France, vol. 40, 1912.

The reader will observe the complete analogy between the recurrent motions of that
paper and recurrent point groups.




It may now be possible to represent the given recurrent point group
in the form

u = φ(t), v = ψ(t); u_1 = φ(t_1), v_1 = ψ(t_1),

where φ, ψ are continuous functions, where u, v are ordinary surface coordinates
for S, and where t ranges over the values specified. We shall say that the
recurrent point group is of rank 1 in this case.

Or it may be possible to write

u = φ(t, w), v = ψ(t, w); u_1 = φ(t_1, w_1), v_1 = ψ(t_1, w_1),

where w has properties analogous to t. We then say that the recurrent point
group is discontinuous of rank 2.

This definition obviously extends to any rank and is applicable to partially
as well as totally discontinuous point groups.

It would be interesting to know whether or not the rank is finite in all
cases which actually arise in applications.


§ 59. Unstable recurrent point groups.

Let us term a recurrent point group and its minimal set Σ unstable if, for
ε > 0 sufficiently small, it is impossible to find δ such that points P within
distance δ of Σ remain within distance ε of Σ under indefinite iteration of T and
T^{-1}. In the contrary case let us call the point group and the set Σ stable.
This agrees with our earlier definition of stability in the case of an invariant
point.

Let Σ be an unstable minimal set and P a point group such that the
sequence of points P, T(P), T^2(P), ... has Σ as the only minimal set in the set
of ω limit points. Then the point group of P will be said to be positively
asymptotic to Σ. Similarly if the sequence P, T^{-1}(P), T^{-2}(P), ... has a single
minimal set Σ in its set of α limit points then the point group of P will be said
to be negatively asymptotic to Σ.

It is apparent that we cannot have the phenomenon of asymptotic point
groups save when Σ is unstable. For, if P is any point at distance more than
ε from a stable set Σ, its iterates cannot approach to within some distance δ
of Σ by the definition of stability. Moreover our earlier work shows that for
hyperbolic periodic point groups¹ such asymptotic point groups lie along
hypercontinuous branches, while for regular elliptic periodic point groups other types
of asymptotic point groups are present. In both of these cases the point P
tends toward Σ asymptotically, under T or T^{-1}, although such a state of affairs
is not required by our definition.

In the regular case an unstable periodic point group possesses positively and 
negatively asymptotic point groups forming connected sets of the kinds earlier 
specified. 

Moreover even in the irregular case the work of § 51 shows that we will
have connected α and ω sets. These furnish asymptotic point groups unless
there are other recurrent point groups in these sets. This follows by the last
result of § 56.

In the irregular case an unstable periodic point group possesses such asymptotic
sets unless there are infinitely many recurrent point groups in its infinitesimal
vicinity.

It is this possibility which arises for a hyperbolic invariant point through 
which passes an invariant point curve. The nearby invariant points are the 
recurrent point groups in the vicinity. 

Our initial conclusion for recurrent non-periodic point groups is the 
following: 

An unstable minimal set (not periodic) possesses positively and negatively 
asymptotic point groups forming a connected set, at least unless there are other 
recurrent point groups in its infinitesimal vicinity. 

In fact, if possible choose ε so small that there are no other recurrent point
groups within distance ε of Σ. Now choose δ extremely small and consider the
iterates of points within distance δ of Σ under T. Because of the instability
of Σ these iterates reach out in a connected set to distance ε in N iterations
(N large). By a limiting process like that employed in § 51 we infer the existence
of a closed set of points connected with Σ, reaching out to the boundary of this
ε vicinity, and remaining within this neighborhood under indefinite iteration of
T^{-1}. But each point of this set has only the minimal set Σ in its α point group.
Hence these points approach Σ uniformly often under iteration, by the last result
of § 56, and are negatively asymptotic to Σ. The existence of a positively
asymptotic set may be similarly established.

To advance further we introduce the notion of isomorphic recurrent point
groups: Two recurrent point groups with minimal sets Σ, Σ' are isomorphic if it

¹ A periodic point group of q points P, T(P), ..., T^{q−1}(P) is called hyperbolic if P is
hyperbolic under T^q. A similar terminology is employed in general.



is possible to establish a correspondence of closed point sets of Σ to closed point
sets of Σ' which is maintained under T. It is assumed that there is more than
a single set unless Σ or Σ' consists of a single point. Thus two periodic point
groups of k and l points are isomorphic only if k and l have a common prime
factor. Similarly two one-dimensional continuous recurrent point groups are
isomorphic only if their rotation numbers are the same or if they fall into k
and l curves, where k and l have a common prime factor.
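
The arithmetic condition stated for periodic point groups amounts to asking
that the greatest common divisor of k and l exceed 1. A one-line sketch
follows; the function name is hypothetical, the test is only the necessary
condition given in the text, and it is meant for k, l > 1, since the text
treats minimal sets consisting of a single point separately.

```python
from math import gcd

def may_be_isomorphic(k, l):
    """Necessary condition only: k and l share a prime factor, i.e. gcd(k, l) > 1."""
    return gcd(k, l) > 1

print(may_be_isomorphic(4, 6), may_be_isomorphic(4, 9))
```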

If there are not an infinitude of recurrent point groups in the neighborhood of
Σ and isomorphic with it, there will exist such connected asymptotic sets.

The existence of infinitely many nearby recurrent point groups is an
evident necessary condition for the non-existence of asymptotic sets of this
description. To show that infinitely many of these are isomorphic with Σ,
we note that the earlier argument for existence of such positively asymptotic
point groups only fails if the connected ω set obtained contains other minimal
sets besides Σ. Let Σ' be such a set. By operating with T indefinitely often
upon the set connecting Σ and Σ' we infer that there exist point sets connecting
Σ and Σ', and remaining in the ε neighborhood of Σ, Σ' under indefinite
iteration of T and of T^{-1}. Let us establish a correspondence between the sets of
points of Σ and Σ' so connected.

Now if all the points of Σ and Σ' are so connected we have a connected
invariant set under T, and included by it an invariant point of course. If
invariant points exist in every vicinity of Σ, there exists an invariant point
on Σ, which must coincide with Σ. Hence in all cases the sets Σ and Σ'
are isomorphic. If there are a finite number of connected sets we are led to
isomorphic periodic point groups near Σ.

By letting ε approach 0 we arrive at infinitely many periodic or other
recurrent point groups having minimal sets isomorphic with Σ and lying in its
immediate vicinity. This is under the hypothesis that there are no asymptotic
sets of the type described.

It is to be hoped that a more complete analysis of the notion of isomorphism
will be made.

Let us say that a point is positively (negatively) asymptotic to a set of
isomorphic recurrent point groups if these and these alone form the recurrent
point groups among its ω (α) limit points.

The above argument then enables us to state the following:

For a given recurrent point group in any continuum D there exist connected
sets positively and negatively asymptotic to a set of isomorphic recurrent point groups
containing the given point group unless there is such an isomorphic point group with
a point on the boundary of D.




§ 60. Stable recurrent point groups.

The simplest type of continuous recurrent point groups is the periodic type.
If this is stable each of the k points of the group is clearly surrounded by
infinitely many neighboring curves which are permuted by T. These curves are
invariant under T and their form has been partially determined (§§ 44-47).

The two-dimensional continuous type is stable by definition since its points 
fill S. 

Suppose finally that we have a stable continuous one-dimensional recurrent
point group with minimal set Σ. On either side of the curve Σ it is readily
inferred (see § 42) that we have an infinite succession of nearby invariant curves.
If the rate of rotation of nearby points exceeds that along the curve (as in the
case of a regular neighborhood of an invariant point) the nature of these curves
can be discussed more fully, but we will not attempt such a discussion.

Thus a stable one-dimensional continuous recurrent point group is surrounded
by infinitely many neighboring invariant curves on either side.

In the case of a discontinuous recurrent point group with minimal set Σ
we are led similarly to a set of nearby invariant sets of continua containing
the set Σ as inner points and lying within distance ε of Σ. Clearly

∬ Q du dv

taken over any of these continua is the same, so their number is finite, and
they are carried into themselves by T^k. Thus there is an invariant point under
T^k within each of them. Such a point P lying near a point of Σ and in the
same continuum clearly remains nearby under iteration of T or T^{-1}. By letting
ε decrease the number of these continua increases indefinitely. At each stage k
is unaltered or changes to a multiple of itself.

A stable periodic point group of k points has in its neighborhood infinitely
many invariant sets of k enclosing curves as specified.

A stable one-dimensional recurrent point group has in its neighborhood
infinitely many invariant rings within which it lies.

A stable discontinuous recurrent point group has in its neighborhood infinitely
many periodic point groups and invariant sets of enclosing curves. A point of a
periodic point group approximates uniformly to any nearby point P of the given
group under all iterations of T and T^{-1}.



Chapter VI. The general point group. 

§ 61. Classification of transformations T.

Before entering upon further discussion of the behavior of points under T, 
we shall effect a classification which is fundamental. 

A transformation T will be called transitive if, for any pair of points P
and Q on S nearby points P' and Q' respectively can be found such that

Q' = T^n(P').

A transformation T is intransitive in the contrary case. 

It seems highly probable that the transitive case is to be regarded as the 
general case. 


§ 62. The transitive case.


We commence with the transitive case.

In the transitive case all of the recurrent point groups are unstable.¹

In fact it has been observed earlier that a stable recurrent point group
leads to continua forming part of S, which are invariant as a set under T and
lie near the point group. Hence if we take a point P outside of these continua
and a point Q within one of them, the condition given in the definition of
transitivity cannot be fulfilled.

We note that invariant sets of continua cannot exist in the transitive case
for the same reason.


In the transitive case the asymptotic α or ω point groups connected with any
recurrent point group and its isomorphic recurrent point groups, together with these
recurrent groups, are everywhere dense throughout the surface S.

For suppose that there is no such asymptotic point in some small region σ
for a recurrent point group with minimal set Σ.

Take then a small vicinity of Σ and consider the regions into which it goes
by T. Evidently this set of regions must ultimately overlap part of σ or we
shall be led to invariant continua, such as cannot exist in the transitive case.

Applying then precisely the same considerations that we have used earlier,
i. e. considering smaller and smaller neighborhoods of Σ, we derive the existence


¹ The exceptional case in which there is a single recurrent point group whose minimal
set fills S is left out of consideration.




of a connected α set reaching from Σ to the boundary of σ at P. Either P
belongs to a point group isomorphic with Σ, or its point group is positively
asymptotic to Σ, or to a set of isomorphic recurrent point groups, by the
preceding paragraph.

In the transitive case any positively asymptotic connected set of points has
infinitely many points in common with any negatively asymptotic set, at least if there
exists a single elliptic periodic point group II' with l finite.

This follows at once from the immediately preceding propositions and from
the structure of the network of asymptotic sets A and Ω about such an
invariant point (§ 53).

For, consider the transformation T^q which leaves such a point P of an
elliptic periodic point group unchanged.

The given connected asymptotic α set reaches into this network indefinitely
near to the invariant point P without meeting the A set. The negatively
connected asymptotic ω set reaches into this network without meeting the Ω set.
Consequently the two sets have infinitely many points in common.

Thus there exist point groups positively and negatively asymptotic to
assigned periodic point groups.

Suppose now that we designate any point whose α or ω limit points do not
form all of S as a special point. All of the points belonging to recurrent point
groups or points asymptotic to such point groups are of this type.

Points which are not special evidently pass into the neighborhood of all
points of S under iteration of T or T^{-1}. Such points we term general.

In the transitive case the general points are everywhere dense in S.

To see this we divide S into a large number of regions S' of small diameter
d, and consider the set of points P whose iterates do not enter within all of
the regions S'. Such points P evidently form a closed set of points, M say.

This set M is nowhere dense in S. In the contrary case suppose M to
fill a small region σ'. Now there are only a finite set of regions S' and thus
only a finite number of combinations of less than all of them. Divide the points
of σ' into the finite number of closed sets according to the regions S' which the
points enter. Thus σ' is divided into a finite number of closed sets, at least
one of which therefore fills some neighborhood σ'' of σ' densely. We recall
that a finite or denumerably infinite set of nowhere dense closed sets cannot
fill a complete neighborhood. But the existence of such a region σ'' contradicts
the condition that T is transitive. Thus M is nowhere dense.

Again choose a set of subregions S'' of the regions S' of diameter less than
d/2, leading to a set M' which includes M by a similar process. The set M' is
nowhere dense.


By continuing in this fashion we get an infinite set of closed sets M, M', ...
each containing its predecessor. Every point P which has not all of S for its
set of limit points evidently belongs to some one of these sets.

But by the theorem quoted above the set of all points belonging to
some M^(k) nowhere fills a complete neighborhood. Hence the stated
property holds.

It would appear to be a very important and difficult question to determine
the relative measure of the special points and general points. The above
argument renders it clear that both of these sets are measurable in the sense of
Lebesgue, but sheds no light on their relative measures. One naturally
conjectures that the special points are of measure 0.


§ 63. The intransitive case. 

In the intransitive case there exists at least one pair of points P, Q such 
that no point very near to P goes into a point very near to Q under iteration 
of T or T^{-1}. Obviously this state of affairs implies the existence of invariant 
sets of two-sided curves forming the boundaries of open continua on opposite 
sides of which P and Q lie. 

We term a transformation T for which there exist only a finite number 
k > 0 of such curves finitely intransitive; otherwise, infinitely intransitive. 

Within one of the invariant sets of continua bounded by these curves in 
such a finitely intransitive case, the condition for transitivity is satisfied, i.e. for 
any pair of points P, Q within, nearby points P', Q' respectively can be found 
such that Q' = T^n(P') for some n. 

In the finitely intransitive case the theorems stated for the transitive case hold 
within each invariant set of continua. 

The infinitely intransitive case obviously includes the integrable case when 
the points move along analytic curves. More generally, it includes the case 
when there is at least one stable recurrent point group. Indeed it seems possible 
that the existence of such a stable recurrent point group is a necessary as 
well as a sufficient condition for infinite intransitivity. But we have not been 
able to establish this conjecture. 



In order to satisfactorily describe the point groups and their interrelations 
in the intransitive case it is essential to know the possible types of invariant 
sets of curves. Lacking such information save for the neighborhood of a periodic 
point group of elliptic type II', l finite, we do not attempt to go further. 


Chapter VII. Dynamical applications. 

§ 64. The equations of motion. 

For definiteness we consider a dynamical system with equations of the form 

(32)    $\frac{d}{dt}\frac{\partial L}{\partial x'} - \frac{\partial L}{\partial x} = 0, \qquad \frac{d}{dt}\frac{\partial L}{\partial y'} - \frac{\partial L}{\partial y} = 0,$ 

where L is a function of the two coordinates x, y and their time derivatives 
x', y'. This differential system is of the fourth order. If then we regard x, y, 
x', y' as the coordinates of a point in four-dimensional space the motions of the 
dynamical system are represented by a set of curves, one through each point 
of the space. 

Now we have the well-known integral relation 

(33)    $x'\frac{\partial L}{\partial x'} + y'\frac{\partial L}{\partial y'} - L = \text{const.}$ 

Hence these curves lie on ∞¹ three-dimensional manifolds. We fix attention on 
any one of these. 

We assume this three-dimensional manifold to lie in the finite part of the 
four-dimensional space and to be without singularity. 
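
The integral relation (33) follows directly from the equations of motion (32) when L does not involve t explicitly, as is the case for the L of the text. The short derivation below is offered as a sketch of the standard computation, not as a quotation from the paper.

```latex
% Differentiate the left-hand side of (33) along a motion, L = L(x, y, x', y'):
\begin{align*}
\frac{d}{dt}\Bigl(x'\frac{\partial L}{\partial x'} + y'\frac{\partial L}{\partial y'} - L\Bigr)
 &= x''\frac{\partial L}{\partial x'} + x'\frac{d}{dt}\frac{\partial L}{\partial x'}
  + y''\frac{\partial L}{\partial y'} + y'\frac{d}{dt}\frac{\partial L}{\partial y'} \\
 &\quad - \Bigl(\frac{\partial L}{\partial x}x' + \frac{\partial L}{\partial y}y'
  + \frac{\partial L}{\partial x'}x'' + \frac{\partial L}{\partial y'}y''\Bigr) \\
 &= x'\Bigl(\frac{d}{dt}\frac{\partial L}{\partial x'} - \frac{\partial L}{\partial x}\Bigr)
  + y'\Bigl(\frac{d}{dt}\frac{\partial L}{\partial y'} - \frac{\partial L}{\partial y}\Bigr) = 0
\end{align*}
% The last step uses the two equations (32); hence the quantity is constant on each motion.
```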


§ 65. Periodic motions. 

To periodic motions correspond closed curves of the three-dimensional spread 
above obtained. 

Suppose we take a point P of such a stream line and consider a small 
element of an analytic surface containing P and cutting the closed curve at an 
angle not 0. If we take any point A on this element near to P, and follow 
along the unique curve through it in the sense of increasing time t, the element 
will be crossed again later at a point Q. The transformation of the element 
which takes A into Q is the transformation T which we shall consider. 

The conservative transformation T thus defined¹ is clearly essentially independent 
of the particular surface element employed, since any other transformation 
so obtained can be derived from T by a proper change of variables. 

We classify the periodic motions into types I', I", II', II", II"', III', III" 
according as T is of such a type (§ 2). We define a periodic motion to be 
elliptic or hyperbolic according as the transformation T is elliptic or hyperbolic; 
and the integer l is similarly defined. The periodic motion is stable if nearby 
motions remain nearby for all t. This means that T is stable. In the contrary 
case the periodic motion and T are unstable. 

Finally we will term the dynamical problem integrable if T is integrable. 

§ 66. The integrable case. 

In the integrable case T leaves a family of curves F* = const. invariant. 
Thus in the three-dimensional representing space there is a one-parameter analytic 
family of surfaces in the vicinity of the closed curve representing the 
periodic motion, each surface being made up of curves of motion. If this motion 
is of elliptic type there is a family of closed annular surfaces of which the 
curve of motion forms a degenerate member. If this motion is of hyperbolic 
type these surfaces are open and the curve lies on one or more of them. This 
much is obvious. 

The necessary and sufficient condition for integrability of the dynamical problem 
is the existence of an integral relation G(x, y, x', y') = 0 where G is analytic in its 
indicated arguments, and where G = 0 is not an identity in virtue of the known 
integral relation. 

The condition is evidently sufficient. This relation yields an invariant 
family of surfaces in the three-dimensional representing space and these cut the 
surface elements used to define T in a family of invariant analytic curves in the 
vicinity of the invariant point under T. Consequently T is integrable. 

Conversely, if T is integrable we obtain an analytic family of surfaces 
on which the curves of motion lie. These may be represented in the form 
H(u, v, φ) = 0, where u, v are coordinates for the surface elements and φ is 
an angular coordinate. Also H is analytic in its three variables. On account 
of the fact that the four-dimensional manifold under consideration consists of a 
one-parameter analytic family of the three-dimensional manifolds which we have 
under consideration, it is apparent that these variables u, v, φ may be expressed 
as analytic functions of x, y, x', y'. By this means a relation of the desired type 
is obtained. 

¹ See my paper first cited. 

In the hyperbolic integrable subcase there exist k > 0 one-parameter analytic 
families of motions asymptotic to the given periodic motions for lim t = ±∞ (or 
else periodic) whose analytic representation we will not specify.* All other nearby 
motions first approach and then recede from the periodic orbit. 

This conclusion is an immediate consequence of the form of T near a hyper¬ 
bolic invariant point. 

In the elliptic integrable subcase nearby motions have coordinates x, y representable 
as analytic functions of variables $e^{\sqrt{-1}\,\lambda\tau}$, $e^{\sqrt{-1}\,\mu\tau}$, while t = cτ + another 
function of this type. 

The curve of motion lies on a torus and a point on such a curve increases 
its angular coordinates by a fixed amount as a single circuit of the torus is 
made (§ 50). Evidently an analytic distortion takes this torus into an ordinary 
right circular cylinder on which the curves of motion are the spirals making a 
fixed angle with the generators. Now x, y can be expressed as periodic analytic 
functions of the angular coordinates p, q on this torus. But this establishes 
the stated form of representation for x, y. Also dt/dτ is a similar function 
of p, q, whence the form of the expression for t in terms of τ. 


§ 67. Formal series in the non-integrable case. 

Evidently the results of the first part of the present paper may be interpreted 
as results for the formal series representing the motion in the dynamical 
problem. The asymptotic validity of such series can be readily established. 
We will only remark upon the following fact: Inasmuch as there exists a formally 
invariant series F* in the non-integrable case (§ 10), there exists always 
a formally invariant integral relation of the type G = 0 considered above. Thus 
the dynamical problem is 'formally integrable' in the vicinity of an elliptic or 
hyperbolic periodic motion. In the elliptic case this means that periodic power 
series in two periods may be employed to represent nearby motions. 

If I am not mistaken it has never yet been demonstrated that integrability 
in the above sense cannot always prevail, although such a possibility appears 
remote. Poincaré has merely shown that integrability does not exist uniformly 
throughout certain domains with variation of a parameter μ.¹ The particular 
example of § 31 yields a non-integrable conservative transformation, but it is 
not yet established that such a transformation arises in a dynamical problem. 

* In the simplest and general case I', x, y may be expressed as convergent power series 
in p*-’ while we have t = cτ + another power series of the same sort. 


§ 68. Periodic motions in the non-integrable case. 

The results of Chapter II when interpreted in the general hyperbolic case 
show at once: 

The results stated above for the integrable hyperbolic case hold also in the non-integrable 
case, the analytic families of asymptotic motions being representable by 
means of hypercontinuous functions. 

Interpreting the results of Chapter III for the stable elliptic case II', l finite, 
we conclude: 

In the non-integrable stable elliptic case II', l finite, there exist an infinite 
number of continuous closed one parameter families of nearby motions, representable 
by means of continuous biperiodic functions of limited variation and which are invariant 
as a family upon a circuit of the periodic motion. 

In the unstable elliptic case II', l finite, there exist connected families of asymptotic 
motions for both lim t = −∞ and lim t = +∞, each containing the given periodic 
motion. The family of the one type has infinitely many doubly asymptotic 
motions in common with any family of the other type. The motions not in any 
such family are everywhere dense near the periodic motion. 

As before we omit details. 

§ 69. Surfaces of section. 

In very many if not in all cases an analytic surface of section S in the 
three-dimensional spread representing the motions may be found with the property 
that it is cut by every curve of motion in one and the same sense and 
has boundaries formed by closed curves representing periodic motion. 

By following along a curve of motion from a point P of such a surface 
to the next point Q, in the sense of increasing time, a transformation T for 
which Q = T(P) is defined. This transformation is one-to-one, analytic and 
conservative. 
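
The construction of T from a surface of section can be illustrated numerically. The sketch below is not taken from the paper: it assumes, purely for illustration, the integrable Lagrangian L = (x'² + y'²)/2 − (x² + ω²y²)/2 with a hypothetical frequency ω = √2, takes the section y = 0 crossed with y' > 0, and follows each motion with a Runge-Kutta step until the next crossing. All function names are illustrative.

```python
import math

OMEGA = math.sqrt(2.0)  # assumed frequency of the y-oscillator (illustrative)


def deriv(state):
    """Equations of motion for L = (x'^2 + y'^2)/2 - (x^2 + OMEGA^2 y^2)/2."""
    x, y, vx, vy = state
    return (vx, vy, -x, -OMEGA ** 2 * y)


def rk4_step(state, h):
    """One classical Runge-Kutta step of size h."""
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = deriv(state)
    k2 = deriv(shifted(k1, h / 2))
    k3 = deriv(shifted(k2, h / 2))
    k4 = deriv(shifted(k3, h))
    return tuple(s + h / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def return_map(point, energy, h=1e-3):
    """Send P = (x, x') on the section y = 0, y' > 0 to the next crossing Q."""
    x, vx = point
    vy_squared = 2 * energy - vx * vx - x * x  # energy integral evaluated at y = 0
    if vy_squared <= 0:
        raise ValueError("point does not lie on the chosen energy manifold")
    prev = rk4_step((x, 0.0, vx, math.sqrt(vy_squared)), h)  # leave the section
    while True:
        nxt = rk4_step(prev, h)
        if prev[1] < 0.0 <= nxt[1]:  # y passes from negative to non-negative
            frac = -prev[1] / (nxt[1] - prev[1])  # linear estimate of crossing time
            final = rk4_step(prev, frac * h)
            return (final[0], final[2])
        prev = nxt


if __name__ == "__main__":
    p = (0.3, 0.1)
    for _ in range(3):
        q = return_map(p, energy=0.5)
        print(q, q[0] ** 2 + q[1] ** 2)  # second value stays nearly constant
        p = q
```

In this assumed example the return map is a rotation of the (x, x') plane, so the circles x² + x'² = const. play the role of the invariant family of curves of the integrable case (§ 66).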

We consider the totality of motions by the aid of such a surface S. 

¹ H. Poincaré, Les méthodes nouvelles de la mécanique céleste, vol. 1, Paris 1892, Chap. 5. 


§ 70. Recurrent motions. 

A recurrent motion may be defined as one which comes arbitrarily near all 
its phases during any sufficiently large interval of time (from t = −∞ to 
t = +∞). Evidently such a motion corresponds to a recurrent point group 
on S. Hence we find: 

Every motion has at least one recurrent limit motion for lim t = +∞ (and for 
lim t = −∞). It recurs uniformly often arbitrarily near some one of these limit 
recurrent motions (not necessarily the same one).¹ 

Recurrent motions may either be periodic, biperiodic (representable on a 
square or torus) or triperiodic (representable in a cube), or discontinuous.* We 
will not follow out the classification suggested by §§ 57, 58 further. 

§ 71. Asymptotic motions. 

A motion will be said to be positively (negatively) asymptotic to a recurrent 
motion if it has only this recurrent motion as a limiting recurrent motion for 
lim t = +∞ (lim t = −∞). Furthermore we will say that two recurrent motions 
are isomorphic if the corresponding point groups are isomorphic. The direct 
application of the results of § 59 gives then: 

Unless there are infinitely many nearby isomorphic recurrent motions, any 
recurrent motion has connected families of motions asymptotic to it for lim t = +∞ 
and for lim t = −∞. 


§ 72. Transitive and intransitive systems. 

If a motion can be found passing from nearly one prescribed phase to any 
second prescribed phase the dynamical system is transitive. Here T is transitive 
also, and conversely (§ 62). Otherwise the dynamical system is intransitive. 

In the transitive case the motions asymptotic in either sense to a given recurrent 
motion or set of isomorphic recurrent motions, together with these motions, are 
everywhere dense. 

Infinitely many motions exist doubly asymptotic to any two prescribed recurrent 
motions (or isomorphic sets of such motions) for lim t = ±∞, at least if there exists 
a single periodic motion of the elliptic type II', l finite. There exist also a dense 
set of general motions which approach every possible phase arbitrarily closely for both 
lim t = +∞ and for lim t = −∞. 

¹ See my paper last cited. 

* The existence of recurrent motions of discontinuous type has been established by H. 
C. M. Morse, Certain types of geodesic motion on a surface of negative curvature, Harvard 
Dissertation, 1917. 

The intransitive case includes the integrable case. The simplest possibility 
is the finitely intransitive case when the curves of motion fall into k > 0 types 
filling out regions in the three-dimensional manifold. This corresponds exactly 
to the finitely intransitive type of transformation T. 

In the finitely intransitive case each type of motions has the same properties 
stated above for the transitive type. 

We do not consider the infinitely intransitive type of dynamical system 
except as covered in the general results stated above. Here we have infinitely 
many types of motion, and, in default of a knowledge of the types which may 
exist, the results to be obtained are necessarily vague. 

§ 73. Conclusion. 

The varying degree of definiteness of the results above obtained for dynamical 
systems is striking. The catalogue of types of motion according to their 
degree of simplicity appears to run as follows: ordinary periodic motions, 
biperiodic motions representable analytically by convergent trigonometric series 
in two arguments, triperiodic motions representable by three arguments; motions 
asymptotic to periodic motions of hyperbolic type, motions asymptotic to periodic 
motions of elliptic type and of the other types just referred to; recurrent 
motions of biperiodic or triperiodic type (not representable by convergent 
trigonometric series); recurrent motions of discontinuous type; motions asymptotic 
to recurrent motions of these new types (or to sets of isomorphic recurrent 
motions); special motions (i.e. not passing near all phases for both lim t = +∞ 
and lim t = −∞) not of above types; general motions. 

The degree of definiteness attained has varied with the analytic instruments 
at hand, and will probably be found to correspond to the nature of the case, 
at least unless entirely new analytic instruments are discovered. 

The remarkable diversity and complexity of structure possible in dynamical 
systems with two degrees of freedom is likely to stand permanently in the way 
of approach to any definitive form for the theory of such systems. As has 
appeared above, many of the most vital questions are still without an answer. 
Progress with these questions and progress with the theory of the conservative 
transformations T which we have studied will go hand in hand. 





BULLETIN 

OF THE 

NATIONAL RESEARCH COUNCIL 

Vol. 4. Part I SEPTEMBER. 1922 Number 19 


CELESTIAL MECHANICS 


Report of the Committee on Celestial Mechanics of the 
National Research Council* 


By 

E. W. Brown, Professor of Mathematics, Yale University, Chairman; 
G. D. Birkhoff, Professor of Mathematics, Harvard University; 

A. O. Leuschner, Professor of Astronomy, University of 
California; H. N. Russell, Professor of Astronomy, 

Princeton University 


CONTENTS 

Part I. The Solar System 

Part II. Celestial Mechanics as Applied to the Stars 

Part III. The Theory of the Problem of Three or More Bodies 


REPORT OF THE COMMITTEE ON CELESTIAL MECHANICS 

Celestial mechanics, broadly interpreted, is involved in practically 
all the astronomy of the present time. The limited meaning 
of the term now usually adopted refers only to those problems 
in which the law of gravitation plays the chief or only part, and more 
particularly to those which deal with the motions of bodies about 
one another and with their rotations. This limitation will, in any 
case, be adopted in this report since surveys dealing with other 
aspects of astronomy have been written or are contemplated. It 
is, however, necessary not to be too rigid about the border lines, 
especially in considering questions where the gravitational action 
does not fully account for the observed phenomena. 

The report has three general divisions: I, the solar system; II, 
the stellar system; III, the theoretical aspects of the general problem 
of three or more bodies. It is not intended to contain a complete 
account of the present status of the subject. More emphasis 
has been laid on those portions which are under discussion at 
the present time and on problems, at present unsolved, which need 
discussion and solution. 

* The membership of the committee is: E. W. Brown, chairman, G. D. Birkhoff, 
A. O. Leuschner, H. N. Russell. 


PART I. THE SOLAR SYSTEM 


The order of treatment is as follows: The moon, the eight major 
planets, their satellites other than the moon, the asteroids or minor 
planets, comets. The general view of the subject now, as in the 
past, has been to consider the consequences of the law of gravitation, 
the extent to which it accounts for the observed motions 
(leading to the discovery of other possible influences), and prediction 
for future observation and comparison with theory. 

The Moon. The gravitational motion has been worked out 
sufficiently to satisfy all observational needs of the past and probably 
of some centuries in the future, and the results are fully embodied 
in tables constructed to furnish the moon’s position without 
excessive labor. The observational data are the daily Greenwich 
observations (weather permitting) since 1750, isolated series 
of observations, eclipses, occultations since the beginning of the 
sixteenth century, and occasional ancient records of eclipses and 
occultations during the past forty centuries. These have led to 
the establishment of the following differences from a purely gravitational 
theory: 

(a) An apparent secular acceleration of the moon’s mean motion 
of about 4".5* per century per century, combined with an acceleration 
of the earth's mean motion about the sun ("acceleration 
of the sun") of a little over 1", with probable errors, according to 
Fotheringham,¹ to whom the latest figures are due, of about ±0".5. 
The former has frequently been attributed to a slowing down of the 
earth’s rate of rotation due to tidal friction: the new work of 
G. I. Taylor² and H. Jeffreys³ has rendered this explanation very probable 
both qualitatively and quantitatively, especially as it also 
accounts for most of the sun’s acceleration. 

* This is the coefficient of t² in the expression for the longitude, generally 
misnamed the acceleration. The true acceleration is therefore twice this amount. 

¹ M. N. R. A. S., 80, p. 581. 

² Phil. Trans. R. S., 220, p. 1. 

³ Ibid., 221, p. 239; M. N. R. A. S., 80, 309. 

(b) A long-period term of some 275 years period and 13" amplitude 
in the mean longitude, obtained from observations extending 
over about the same time. Numerous hypotheses have been advanced 
to account for this deviation, but none of them rest on any 
secure physical basis nor have they received independent testimony.¹ 

(c) Fluctuations which are evident in the observations of the 
past 170 years and well defined during the last 70 years. In the 
former time their principal period seems to have decreased from 
some 70 years to about 40 years, with an amplitude of some 3" or 
4". In 1914 E. W. Brown² pointed out that similar fluctuations of 
much smaller amplitude could be traced in the motions of the earth 
and of Mercury: these fluctuations were confirmed by Glauert³ 
who found them also in the longitude of Venus. The latter also 
showed that they could all be moderately well accounted for by 
changes in the rate of rotation of the earth. No cause is assigned 
for these changes and their magnitude, amounting sometimes to 
as much as a loss or gain of 0s.07 in one year in the rotation of the 
earth considered as a clock, makes the acceptance of the hypothesis 
difficult. It has been suggested that this hypothesis might be 
tested by observations of the eclipses of Jupiter’s satellites, which 
at present seem to furnish the only possible means for the purpose. 
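
The remark in the footnote to item (a), that the true acceleration is twice the quoted coefficient, can be made explicit. The symbols below (λ for the mean longitude, λ₀, n and σ for its constant, linear and quadratic coefficients) are labels assumed here for illustration; only the value 4".5 comes from the report.

```latex
% Assumed notation: \lambda(t) is the mean longitude, t in centuries.
\[
  \lambda(t) = \lambda_0 + n\,t + \sigma\,t^{2}
  \quad\Longrightarrow\quad
  \frac{d^{2}\lambda}{dt^{2}} = 2\sigma .
\]
% With the quoted coefficient \sigma \approx 4''.5 per century^2, the true
% acceleration of the mean motion is about 2\sigma \approx 9'' per century^2.
```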

The Major Planets. The theories and tables by Newcomb and 
Hill seem to satisfy present needs, except perhaps those of Jupiter 
and Saturn, into which some small errors have crept but are now 
in process of examination by Innes. The sufficiency of the adopted 
theories is well shown by the theoretical and observed secular 
changes of the perihelia, nodes, eccentricities and inclination. The 
only large outstanding difference, that of the perihelion of Mercury, 
is fully accounted for by Einstein's addition to the Newtonian law, 
although one or two others need to be kept in mind as being perhaps 
in excess of their actual errors. It may be mentioned that the 
Einstein addition causes an increase of about 2" in the centennial 
motions of the lunar perigee and node⁴ but this is just at the limit 
of accuracy of Brown’s theory and probably much beyond detection 
by observation for many decades to come. 

Attempts made to discover a supposed trans-Neptunian planet 
by its perturbations on Neptune or Uranus⁵ have been unsuccessful. 
The errors of the latter, though considerable when the tables of Leverrier 
or Gaillot are used, have become very small with Newcomb’s 
tables, and the observations of Neptune are not sufficient for the 
purpose. 

¹ See E. W. Brown, Amer. Jour. Sc., Ser. 4, 29, p. 529. 

² Brit. Assoc. Report, 1914, p. 320. 

³ M. N. R. A. S., 75, p. 489. 

⁴ de Sitter, M. N. R. A. S., 77, p. 172. 

⁵ P. Lowell, Lowell Obs. Trans., 1. 

From time to time, numerical relations of the masses, distances, 
etc., like those contained in Bode’s law, appear but have not so far 
given theoretical results. A curious fact concerning the distribution 
of the poles of the planetary orbits, noted by H. C. Plummer,¹ 
deserves mention. 

Satellites. Neither the Uranian nor Neptunian systems present 
many points of interest to the theoretical astronomer, on account 
of their distances from the sun and Earth. The four inner satellites 
of Jupiter, partly on account of the librational relation between 
three of them and partly because of the possibility of testing the 
constancy of the rate of rotation of the Earth by observations of 
their eclipses, have been again considered by R. A. Sampson, 
who has worked out their theory and has published tables. The 
outer satellites present several features of mathematical and 
physical interest. In Hyperion and Titan, satellites of Saturn, 
there is another case of libration worked out to a limited extent by 
Newcomb. Since the issue of Vol. IV of Tisserand’s Mécanique 
Céleste in 1896, which contains a full account of the work to that 
date, the orbit of Phoebe (Saturnian system), and of the sixth 
satellite of Jupiter have been worked out by Ross.² 

Asteroids. Nearly one thousand of these bodies are now known. 
From the gravitational point of view, they possess the greatest 
mathematical interest, on account of the large perturbations produced 
in their orbits by Jupiter. 

Long period inequalities constitute the chief difficulty in all the 
gravitational problems of the solar system, on account of the further 
approximations needed to obtain the required degree of accuracy 
of the numerical values of the comparatively large coefficients. 
The older methods went ahead without reference to them and carried 
out the approximations as they were needed, as for instance in 
Hansen's method used by G. W. Hill for Jupiter and Saturn in which 
a very long period term with a large coefficient causes most of the 
trouble. In the newer methods initiated by Gyldén and his followers, 
among them Backlund and Brendel, an attempt is made to 
introduce such terms as early as possible so as to diminish the 
ultimate work of computation. 

¹ M. N. R. A. S., 76, p. 378. 

² Harvard Annals, 53, VI. 

There are two types of oscillation which differ in their mathematical 
treatment and in their physical results. The ordinary 
long-period type is that in which a forced oscillation has a period 
near that of a free oscillation. But when the two periods become 
almost exactly the same, the free oscillation is compelled to take 
the period of the forced oscillation, and there is then a new oscillation 
of finite period about this position: the latter is called a libration. 
It is well known that though the periods of the asteroids have 
a considerable range, there are none certainly known whose periods 
are exactly 1/2, 1/3, 2/5 that of Jupiter, while there are considerable 
numbers with periods a little different from these fractions. It 
is obvious that the resonance has some relation to the distribution, 
but so far all mathematical investigation has failed to show any 
reason for the gaps: there is no evidence of instability in the deductions 
from the equations of motion. Attempts to search for the 
cause in cosmogonic speculations or in a resisting medium have been 
made, but a more complete investigation of the gravitational effects 
is needed. The problem is similar to that of the divisions in Saturn’s 
ring, connected with perturbations produced by the satellites. The 
question is complicated by the fact that several cases of libration 
exist without apparent instability, e.g., amongst the satellites of 
Jupiter, of Saturn, in the Trojan group of asteroids whose mean 
period is the same as that of Jupiter and, best known of all, in the 
rotation of the moon which has the same period as that of its revolution 
around the earth. 

Numerous statistical investigations have been carried out, but 
little has been deduced from them, except in the way of confirming 
known perturbative effects. 

Closely related in importance to the foregoing purely theoretical 
considerations is the practical problem of suitable numerical methods 
for the representation of the motion of the asteroids. Hansen’s 
and the Gyldén-Brendel methods have been referred to. Of fundamental 
importance for practical purposes are also the methods inaugurated 
by Bohlin for the group determination of the perturbations 
of planets which have a mean motion nearly commensurable 
with that of Jupiter. Bohlin’s method rests on Hansen and has 
been followed by Von Zeipel, Leuschner, and D. F. Wilson, by the 
former in application to the group 1/2, by the latter to the group 
2/5. Bohlin’s developments are general for all groups, with special 
application to the group 1/3. A feature of these methods is the use 
of elements which are similar to the elements ordinarily known as 
mean elements. In all of these methods the mean motion, eccentricity, 
and inclination may lead to complications. The success of 
any method depends upon the possibility of meeting the complications 
arising from critical values of these three elements separately 
and jointly. This has not been attempted in practice to a degree 
of precision which would reveal any departures from the motion 
under the Newtonian law of the type of the motion of the perihelion 
of Mercury, although for planets with moderate inclination and 
eccentricity and a mean motion not nearly commensurable with 
that of Jupiter, Hansen's method appears entirely suitable. 

The principal aim of astronomers at the present time is to represent 
the motions with sufficient approximation to serve purposes of 
identification and observation. Even with this limitation of 
accuracy the difficulties are considerable. Leuschner published 
preliminary results of his experience with the Watson asteroids.¹ 
Unpublished later results of the planets 10 Hygiea and 175 Andromache 
verify his conclusions that the revised tables of von Zeipel 
for the group 1/2 will give the most satisfactory results for all known 
planets of this group. It would seem therefore extremely advisable 
to have tables computed on Bohlin's or similar plans for other than 
the three groups for which they are available. The Trojan group, 
however, does not appear to lend itself to treatment by Bohlin's 
method without certain modifications. The perturbations of this 
group are being successfully dealt with by E. W. Brown, and Wilkins² 
has represented the observations of 884 Priamus by his own 
treatment to within 10" of arc from one opposition to the next by 
including second order perturbations of the first degree in the 
eccentricity, inclination, and deviation from the center of libration. 
This method will more than answer practical requirements from 
one opposition to the next, while Brown's developments are intended 
to represent positions at any time. 

Brendel classifies the planets after Gyldén as ordinary, characteristic 
and critical planets according to the ratio of their mean motions 
to that of Jupiter, the critical planets being those of close commensurability. 
Application of his method has been made by Brendel 
to one hundred planets with mean motions from 800 to 85*> 
The elements are instantaneous but not osculating and perturbations 
greater than 3'.4 within fifty years are included so as to reproduce 
the geocentric places within 20' for one hundred years. 

» lT e 208, gS 23°3 f thC Nati ° nal AC3demy ° f SCienCeS> 5 ’ PP 67 " 76 ' March - 1919 - 




The secular perturbations in the longitude of the perihelion and of 
the node are the same for all these planets. The periodic perturbations 
of the various elements adopted are computed with the aid of 
five to nine constants for each planet and four arguments, linear 
with the time. Later Brendel¹ has published improved elements 
for sixty of these planets and added three to the list. 

Among other applications of Brendel's method are those by 
Labitzke¹ who has published mean elements and approximate 
perturbations by Jupiter for nineteen of the planets with mean 
motions between 780" and 857" with a somewhat lower degree of 
approximation than that aimed at by Brendel for the previous list. 
Boda¹ has published approximate perturbations for one hundred and 
eight planets of the group 1/3, Hestia group, with mean motions 
ranging from 845" to 958", in two groups according to whether the 
mean motion is greater or less than 897". The perturbations are 
not very large in these cases because the longitudes of perihelia are 
near those of Jupiter. These approximations by Brendel’s method 
serve a very useful purpose. 

Among other investigations which may serve for comparison of 
the suitability of various methods are those by Notebaum² and 
Osten.³ The former has developed the perturbations of 433 
Eros after Leverrier. Commensurability with the mean motions 
of the three perturbing planets is not involved. The latter has 
developed the perturbations of the third order due to two major 
planets for 447 Valentine by Hansen's method, reproducing the 
normal places within ±1" to 2" over 19 years. A preliminary 
review of the present status of the determination of the perturbations 
of minor planets is in progress under the direction of 
Leuschner. 

Of interest in connection with the question of stability for com¬ 
mensurable planets are certain numerical results. Brown's studies 
and von Zeipel's developments do not indicate instability. Miss 
Levy’s unpublished developments by the von Zeipel method place 
the mean mean motion of Andromache at 619.5" and according to 
Berberich's computations by the method of special perturbations the 
osculating value of the mean motion has diminished with slight peri¬ 
odic variations from 617.7" in 1877 to 607.8" in 1921, so that the 

1 Brendel, A. N., 195, 417; A. N., 200, 1; Labitzke, A. N., 212, 217; Boda, A. N.,
212, 219.

2 A. N., 214, 153.

3 Astr. Abh., 15; A. N., 210, 130.




hitherto accepted gap has been well invaded in this case. On the 
other hand Krassowski comes to the conclusion on the basis of Brendel's
theory that for planets of mean motion near 400" or 4/3
times Jupiter’s mean motion, the motion of an asteroid would
become unstable between the limits 391.5" and 401.7". (Travaux
de la Société des Sciences de Varsovie, III. Classe des sciences
math. et nat., Nr. 12.)

Comets. The principal question of interest regarding comets is 
that of their origin, whether they are members of the solar system 
with elliptic orbits or enter it from without in parabolic or hyper¬ 
bolic orbits. The possibility of the “capture” of a comet by diver¬ 
sion from a parabolic into an elliptic orbit has been recognized since 
the days of Laplace and discussed in much detail by H. A. New¬ 
ton. 1 Leuschner 2 has shown that the numerous “parabolic” 
orbits which appear in catalogues of comets, represent cases in which 
the observations are insufficient to detect the deviations from a 
parabola which almost always appear when the observations are 
sufficiently numerous and cover a long enough interval. In the 
latter cases the orbit is usually definitely elliptic. The osculating 
orbit near perihelion is occasionally hyperbolic, but Fayet 3 and 
Stromgren 4 have shown that in every case of this sort the approach 
has been in an elliptic orbit which was later converted to a hyper¬ 
bolic by planetary perturbations. 


All comets so far recorded appear, therefore, to be members of 
the solar system. The origin of the cometary orbits of shorter 
period, especially of the numerous group having a period between 
four and eight years, has been attributed to capture of comets of 
longer period by encounters with the major planets. All investigators
agree in confirming this explanation in the case of the "Jupiter
family" described above. Russell 5 has shown that there is
very little evidence of such capture among the comets of periods 
between 10 and 2,000 years, and that the supposed "families” of 
Saturn, Uranus, and Neptune have little or no foundation. Their 
large number is probably due to the disruption of an originally
smaller number of comets by close approach to Jupiter as suggested 

1 Memoirs Nat. Acad. Sciences, 6, 7-23, 1893.

2 Pub. A. S. P., 19, 67-71, 1907.

3 Annales de l'Observatoire de Paris, Mémoires, 26, 1910.

4 Pub. fra Kobenhavn’s Observatorium, 19, 61, 1914.

5 A. J., 33, 49-61, 1920. (This paper contains a number of references to earlier
works.)



by Callandreau 1 and several groups of comets of probably common 
origin have been indicated by Fayet. 2 

These comets may have had substantially their present periods 
for an indefinite period or have attained them by the slow summa¬ 
tion of small perturbations of the ordinary sort. 

The motion of the matter in the tails of comets, which observa¬ 
tions show to be repelled by the Sun, and probably also by the 
comet’s head, 3 also presents problems of considerable interest,
though the unavoidable lack of precision in the observations makes 
exact computations difficult. 

PART II. CELESTIAL MECHANICS AS APPLIED TO THE STARS 

The incidence of the application of mathematical analysis to 
sidereal problems is quite different from that which is found within 
the solar system. The determination of orbits and disturbed mo¬ 
tion here take a subordinate place, and are largely supplanted by 
questions relating to the statistical equilibrium of systems of great 
numbers of stars, to the internal constitution, rotation, and vibra¬ 
tions of gaseous masses, and to theories of the evolution of the 
Universe and its separate portions. Throughout this field the 
dynamical discussion must keep in close touch with statistical 
methods and, above all, with physics, especially atomic physics. 

1. The problem of the determination of orbits, whether of visual
double stars or of spectroscopic or eclipsing binaries, differs radically
from that presented by the orbit of a planet, comet or satellite,
because of the great difference in the percentage accuracy of the
observations. Errors amounting to ten per cent of the observed
quantity are habitually met with, and it is, therefore, hopeless to
attempt to derive reliable elements from the theoretical minimum
number of data.

Only when numerous observations, covering at least a consider¬ 
able fraction of the period, are available, is it worth while to com¬ 
pute; and graphical methods, based upon curves drawn to repre¬ 
sent the whole course of the observations, are universally employed 
—though the later treatment is often in part analytical. The num¬ 
ber of existing methods of solution is considerable, and their practi¬ 
cal application is an art, fully as much as a science. Preference 
should be given to those procedures which enable the computer to 

1 Annales de l'Obs. de Paris, 22, p. 1-47, 1902.

2 Bull. Ast., 28, 170, 1911.

3 Eddington, M. N., 70, 442-458, 1910.




keep closest to the original observations, rather than to the empiri¬ 
cal curves which have been drawn to represent them. 

An excellent detailed discussion of all three sorts of binary stars 
will be found in Aitken’s recent volume. 1 

2. In view of these facts, it is not surprising that relatively little 
work has been done upon perturbations in multiple stellar systems, 
where orbital motion is shown. Such systems, so far as is known, 
always exhibit a close pair, attended at a relatively considerable 
distance by a companion (sometimes itself double) revolving in an 
orbit of much longer period. In a few cases 2 the system thus formed 
has itself a remote attendant, presumably in very slow orbital mo¬ 
tion. 


The masses of all the components, so far as they are ascertain¬ 
able, are of the same order of magnitude. The problems thus pre¬ 
sented are analogous to those of the Lunar, rather than the Plane¬ 
tary Theory, but are complicated by the great eccentricities of the 
orbits, which sometimes exceed 0.7. 

There is, however, not as yet a single system in which the orbital 
elements of both the close and the wide pair are fully known. When 
the close pair is telescopically separable, the period of the distant 
companion is so long that it will be several centuries before a re¬ 
liable orbit can be computed. Seeliger 3 has discussed the per¬ 
turbations in such systems. The cases in which one or both com¬ 
ponents of a visual binary are themselves spectroscopic binaries of 
short period are more promising, but certain elements (notably the 
inclination of the orbit plane) cannot be found from the spectroscopic
observations. Two such systems 4, 5 have been observed long
enough to show definite evidences of perturbations of the close sys¬ 
tem (advance of the line of apsides and changes in period) and it 
is very desirable that they should be studied analytically. 

3. The distribution and motions of the stars in space, and in 
particular the distribution of velocities which is known as “Star 
Streaming” afford an attractive field for dynamical study. Here 
we are concerned with the statistical distribution of the coordinates 


1 R. G. Aitken, The Binary Stars (New York, 1918), Chapters 4, 6 and 7.

2 α Geminorum and ε Hydrae.

3 H. Seeliger, Abhandlungen der Münchener Akad., II Kl., 17,1011 (1888) (ζ Cancri);
Astronomische Nachrichten, 173, 327 (1907) (ε Hydrae).

4 F. Henroteau, Lick Observatory Bulletin, 9, 120 (1918). Period of close
pair 5.97 days; of wide pair 11.35 years.

5 13 Ceti; J. S. Paraskevopoulos, Astrophysical Journal, 52, 110 (1920). Short period
2.08 days; long period 6.88 years.



and velocities of an enormous number of bodies under their mutual 
gravitation, and the analysis resembles in many ways that employed 
in the kinetic theory of gases. 1

The stars are, however, so small in comparison with the distances
between them that the influence of collisions, or even of encounters
such that their mutual attraction changes their directions of motion
by a degree, may be neglected, unless enormously long intervals of
time are involved. The “time of relaxation” which would be required
to produce extensive alterations in the velocities of the stars
by such encounters is estimated by Jeans 2 as $10^{14}$ years and by
Charlier 3 as $10^{16}$ years.

The motions of the stars under the general attraction of the 
whole mass of stars have been discussed by Eddington 4 and Jeans. 5
They find that a steady state in which star-streaming occurs is 
possible with a spherical "universe” and the direction of streaming 
radial, or with a “universe” shaped like a figure of revolution, and 
with streaming taking place along circles coaxial with the axis of 
symmetry. The first of these models is unlike the actual universe 
of stars, while the applicability of the second is uncertain. It is, 
however, very doubtful whether our universe is in a steady state— 
especially in view of the enormous time which would be required to 
reach one. Jeans 6 has shown that a similar type of streaming might
be produced as the result of successive encounters of our “universe”
with other star clusters, but in a later discussion 7 he reverts tenta¬ 
tively to the previous hypothesis. 

In the globular clusters, the motions are unknown, but the 
distribution of the stars in space is remarkably similar from cluster 
to cluster, and follows closely the law $\rho = \rho_0 (1 + r^2/a^2)^{-5/2}$ (where
$\rho$ is the density of distribution, $r$ the distance from the centre, and
$\rho_0$ and $a$ are constants). Jeans 8 and Eddington 9 have discussed
this question. They find suggestions that this distribution may 
represent the nearest approach to a state of equipartition of energy 

1 Summarized by Jeans, "Problems of Cosmogony" (Cambridge University Press,
1919), Chapter X.

2 J. H. Jeans, M. N. R. A. S., 74, 112 (1913).

3 C. V. L. Charlier, Meddelanden från Lunds Astron. Observatorium, Series II, No. 11.

4 A. S. Eddington, M. N. R. A. S., 74, 5 (1913); 75, 366 (1915), and 76, 37 (1915).

5 J. H. Jeans, M. N. R. A. S., 76, 70 (1915).

6 J. H. Jeans, M. N. R. A. S., 76, 552 (1916).

7 "Problems of Cosmogony," pp. 236-242.

8 M. N. R. A. S., 76, 567 (1916); "Problems of Cosmogony," p. 245.

9 M. N. R. A. S., 76, 572 (1916).



which is possible without a scattering of the outer stars to infinity; 
but the problem is by no means yet solved. 
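
The density law for globular clusters quoted above can be explored numerically. The sketch below is in Python and assumes the exponent of the law to be -5/2, so that the density is rho0 (1 + r^2/a^2)^(-5/2). It compares a numerical integration of the mass inside radius r with the closed form that this exponent admits. The function names and the unit constants are illustrative choices, not taken from the report.

```python
import math


def density(r, rho0=1.0, a=1.0):
    """Assumed cluster law: rho0 * (1 + r^2/a^2)^(-5/2)."""
    return rho0 * (1.0 + (r / a) ** 2) ** (-2.5)


def enclosed_mass_closed_form(r, rho0=1.0, a=1.0):
    """Integral of 4*pi*s^2*density(s) from 0 to r, evaluated analytically."""
    return (4.0 / 3.0) * math.pi * rho0 * r ** 3 / (1.0 + (r / a) ** 2) ** 1.5


def enclosed_mass_numeric(r, rho0=1.0, a=1.0, steps=20000):
    """The same integral by the midpoint rule, as an independent check."""
    h = r / steps
    total = 0.0
    for i in range(steps):
        s = (i + 0.5) * h
        total += 4.0 * math.pi * s * s * density(s, rho0, a) * h
    return total


if __name__ == "__main__":
    for r in (0.5, 1.0, 3.0, 10.0):
        print(r, enclosed_mass_numeric(r), enclosed_mass_closed_form(r))
```

Under this assumption the enclosed mass tends to the finite limit (4/3) pi rho0 a^3 as r increases, although the distribution itself extends to infinity.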

4. The modern theory of the internal constitution of the stars
begins with the work of Schwarzschild 1 who called attention to the
importance of “radiative equilibrium.” The transfer of heat outward
from the interior takes place almost entirely by the emission
of radiation and its absorption in overlying layers, and the tem¬ 
perature gradient is determined by the outgoing flux of heat and 
the opacity of the material to the radiation. Schwarzschild dealt 
only with the atmosphere of the Sun, but Eddington 2 extended the 
analysis to the interior of the stars, both those of low density, with¬ 
in which the simple gas laws are obeyed at all points, and later 3 
to those of higher density. He was the first to point out the impor¬ 
tance of radiation pressure, which at the very high temperatures 
that prevail inside the stars becomes great enough to counteract 
a large part of the gravitational force, and of ionization, which 
breaks up most of the atoms into nuclei (or small nuclear groups) 
and free electrons, and greatly reduces the molecular weight. Upon 
certain plausible assumptions, he concluded that the ratio of the 
radiation pressure to the total pressure is constant throughout the 
star. Hence, the gas pressure is proportional to the fourth power 
of the temperature—a relation which suffices to define the whole 
internal constitution of the star, when combined with the law con¬ 
necting temperature, pressure, and density, for which in general, 
Eddington adopts a simplified form of Van der Waals' Law.

If $\beta$ is the ratio of the gas pressure to the total pressure it follows
that $(1 - \beta)/\beta^4 = C M^2 m^4$, where $M$ is the mass of the star, $m$ the
mean molecular weight of the material composing it, and C is a 
constant, which is the same for all stars of low density, and depends 
only on fundamental physical constants of gravitation, radiation 
and gas-theory. 
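
The relation between the ratio of gas pressure to total pressure (here called beta), the mass M and the mean molecular weight m, namely (1 - beta)/beta^4 = C M^2 m^4, determines beta uniquely, since the left-hand side falls steadily from infinity to zero as beta runs from 0 to 1. The following Python sketch solves it by bisection. The value C = 0.00309, with M in solar masses and m in units of the hydrogen atom, is the figure usually quoted for this quartic and is an assumption here, as is the function name.

```python
def gas_pressure_fraction(mass, mol_weight, c=0.00309, tol=1e-12):
    """Solve (1 - beta) / beta**4 = c * mass**2 * mol_weight**4 for beta in (0, 1].

    mass is in solar masses and mol_weight in hydrogen-atom units (assumed
    units that go with the assumed constant c).
    """
    rhs = c * mass ** 2 * mol_weight ** 4
    lo, hi = 1e-9, 1.0  # the left-hand side is very large at lo and zero at hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (1.0 - mid) / mid ** 4 > rhs:
            lo = mid  # left-hand side still too large: the root lies at larger beta
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    for mass in (0.1, 1.0, 10.0, 100.0):
        beta = gas_pressure_fraction(mass, mol_weight=2.0)
        print(f"M = {mass:6.1f}: gas pressure fraction {beta:.4f}, "
              f"radiation pressure fraction {1.0 - beta:.4f}")
```

The share of radiation pressure, 1 - beta, rises steadily with the mass, which is the behaviour the text goes on to describe.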

For bodies of mass less than about $3 \times 10^{32}$ grams, $(1 - \beta)/\beta^4$
is small, and the radiation pressure is almost negligible. For
masses greater than $3 \times 10^{31}$ grams $(1 - \beta)/\beta^4$ is large, and the
radiation pressure almost neutralizes gravitation, and is the dominant
influence in the internal equilibrium. The interval within which
the change takes place is exactly that in which all known stellar 
masses lie. The masses of the stars appear, therefore, to be determined
by the fundamental properties of atoms. Bodies of smaller
mass do not radiate enough to be visible as stars, while those of
greater mass are in such a delicate state of equilibrium that they tend
to break up into smaller masses.

1 Göttingen Nachrichten, 1906, 141.

2 M. N. R. A. S., 77, 10 (1916).

3 M. N. R. A. S., 77, 506 (1917).

When the mean density increases, and departures from the 
simple laws must be considered, the constant C decreases rapidly. 
Eddington has determined it by quadratures. 

The total radiation from a star's surface is proportional to 
$M(1 - \beta)/K$ where $K$ is the coefficient of opacity of the material.
As the star contracts, the radiation will be relatively great, and 
nearly constant, till the density becomes considerable, and will 
then fall steadily. 

Certain of Eddington's assumptions have been severely criticized 
by Jeans, 1 who proposes a modified theory. 2 The principal point
at issue is the constancy of the ratio of radiation pressure to gas
pressure throughout the star. This will be constant, if, and only if,
the product $K\eta$ is constant, where $K$ is the opacity of the material,
and $\eta$ the mean rate of generation of heat, per unit time, per unit
mass, in the portion of the star nearer the center than the region considered.
It is very probable that $\eta$ increases toward the center and
that K decreases. Eddington, in his original discussion, took them 
as separately constant; but this is unnecessary. He does not pre¬ 
sent his results as rigorous solutions of the physical problem, but 
as solutions of a simplified problem analogous to the more com¬ 
plicated reality. The manner in which his "model” agrees with 
the known properties of the stars is so striking as fully to justify 
his contention that the assumptions on which it is based are probably 
not far from the truth. 

Jeans has further shown that, if the energy of the stars is derived 
from gravitational contraction, those of moderate mass would 
develop more rapidly than those of either small or great mass. 

5. The oscillations of a gaseous star about its normal equilibrium 
have been discussed by Moulton 3 and Eddington 4 with a view to
the explanation of the variation of stars of the short period or Cepheid
type. Moulton's conclusion that a very small oscillation from
a prolate to an oblate spheroidal form would account for great 

1 M. N. R. A. S., 78, 28. Reply by Eddington, Ibid., 78, 113 (1917).

2 M. N. R. A. S., 78, 36 (1917) and 79, 319 (1919); "Problems of Cosmogony,"
Chapter VIII.

3 Astrophysical Journal, 29, 261 (1909).

4 M. N. R. A. S., 79, 2 (1918) and 79, 177 (1919).



variations in brightness, does not appear to correspond to the 
phenomena. 

The type of pulsation considered by Eddington (which appears 
far more promising as an explanation of the facts) is a bodily expan¬ 
sion and contraction during which the radius changes by ten or 
twenty per cent, the temperature rising as the star contracts and 
falling again as it expands. The changes in temperature account 
for the variability in light and color, while the outward and inward 
motions of the surface explain the observed changes in radial ve¬ 
locity; certain important details of the observed variation, especially 
the wide departure of the oscillations from simple harmonic motion, 
remain incompletely explained. The conclusion of the most general 
interest is perhaps that if the energy supply of the star δ
Cephei were derived from gravitational contraction alone, its period 
should be shortening several hundred times as fast as the actually 
observed rate of change. 

6. The problem of the configurations of equilibrium of a rotating 
incompressible mass of fluid is an old one. Successive stages in its 
development are marked by the spheroid of Maclaurin, the ellipsoid 
of Jacobi, and the pear-shaped figure of Poincare. The stability 
of this latter figure had been investigated by G. H. Darwin and 
Liapounoff, with contradictory results, and the problem was only
resolved in 1916 by Jeans, 1 who showed that an expansion to the 
third order of small quantities was required to arrive at decisive 
results, and that the pear-shaped figure was unstable. 

It follows that a slowly contracting and rotating mass of incom¬ 
pressible fluid, after following the series of spheroids and ellipsoids, 
would become physically unstable, and go through a period of 
rapid change, and probably break up into two separate masses. 
The more difficult question of the behavior of a rotating compressible
mass is of greater astrophysical importance. This too has been
attacked by Jeans. 2 He finds that there are two possible mechan¬ 
isms of breaking up. The mass may divide into two (or perhaps 
more) isolated parts by fission, or the centrifugal force at some point 
or line upon its surface may become equal to the gravitational 
attraction, and a stream or sheet of matter may be thrown off into 

1 "On the Potential of Ellipsoidal Bodies," Philosophical Transactions A, 215, 27
(1914). "On the Instability of the Pear-Shaped Figure," Phil. Trans. A, 217, 1 (1916).
"Problems of Cosmogony," Chapters IV, V.

2 "The Configurations of Rotating Compressible Masses," Philosophical Transactions
A, 218, 157 (1917). "Problems of Cosmogony," Chapter VII.



space. When the mass is of uniform density the first of these proc¬ 
esses happens: when it is greatly concentrated at the center, the 
second. 

Intermediate situations may be represented by assuming that the 
compressible material obeys the equation of state $p = K\rho^{\gamma}$ (where
$\gamma = \infty$ gives homogeneity while $\gamma = 1\tfrac{1}{5}$ is found to lead to an
infinitely great central condensation). It is found that fission
takes place if $\gamma$ exceeds 2.2, while if $\gamma$ is less than this limit, the
spheroidal surface is distorted, develops a sharp edge, and matter is
thrown off in a sheet in the equatorial plane. This limiting value
of $\gamma$ corresponds to a surprisingly low degree of central condensation
in a spherical mass, the central density being only about three
times the mean density. 
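
The degree of central condensation that goes with a given exponent in the pressure-density law may be estimated for a non-rotating spherical mass from the Lane-Emden equation, a standard tool which the report itself does not name. The Python sketch below assumes the usual correspondence n = 1/(gamma - 1) between the exponent gamma and the polytropic index n, integrates the equation by a fourth-order Runge-Kutta scheme, and returns the ratio of central to mean density. The function name and step size are illustrative.

```python
import math


def central_condensation(gamma, h=1e-4):
    """Ratio of central to mean density for a sphere obeying p = K * rho**gamma.

    Integrates theta'' + (2/xi) theta' + theta**n = 0 with n = 1/(gamma - 1),
    theta(0) = 1, theta'(0) = 0, out to the first zero xi1 of theta; the ratio
    is then -xi1 / (3 * theta'(xi1)).
    """
    n = 1.0 / (gamma - 1.0)

    def deriv(xi, theta, dtheta):
        # max(...) guards against a slightly negative theta in a trial stage.
        return dtheta, -max(theta, 0.0) ** n - 2.0 * dtheta / xi

    # Start slightly off-centre with the power-series solution.
    xi = 1e-3
    theta = 1.0 - xi ** 2 / 6.0 + n * xi ** 4 / 120.0
    dtheta = -xi / 3.0 + n * xi ** 3 / 30.0

    while True:
        k1 = deriv(xi, theta, dtheta)
        k2 = deriv(xi + h / 2, theta + h / 2 * k1[0], dtheta + h / 2 * k1[1])
        k3 = deriv(xi + h / 2, theta + h / 2 * k2[0], dtheta + h / 2 * k2[1])
        k4 = deriv(xi + h, theta + h * k3[0], dtheta + h * k3[1])
        new_theta = theta + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        new_dtheta = dtheta + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if new_theta <= 0.0:
            # Linear interpolation to the surface, where theta vanishes.
            frac = theta / (theta - new_theta)
            xi1 = xi + frac * h
            d1 = dtheta + frac * (new_dtheta - dtheta)
            return -xi1 / (3.0 * d1)
        xi, theta, dtheta = xi + h, new_theta, new_dtheta


if __name__ == "__main__":
    for gamma in (2.0, 2.2, 3.0):
        print(gamma, central_condensation(gamma))
```

For gamma = 2 the ratio has the known closed form pi^2/3, about 3.29, which serves as a check on the integration; for gamma = 2.2 the computed ratio lies a little below this, consistent with the statement that the central density is only about three times the mean.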

7. The origin and evolution of binary stars has been consider¬ 
ably discussed. Direct observational evidence, showing pairs of 
stars revolving almost in contact, makes it almost certain that such 
systems have been formed by the fission of a single rotating mass, 
and this explanation may be accepted for the spectroscopic binaries 
of short period. The densities of these stars are, however, so low 
that it is doubtful whether the value of $\gamma$ of the gas of which they
are formed can be as great as 2, but Jeans has shown 1 that the influ¬ 
ences of ionization and radiation pressure may remove this difficulty. 

The wider visual binaries, with periods from five years to many 
centuries, present a more difficult problem. Moulton 2 and others 
have proved that their periods can never have been very much 
shorter than at present, and that, if they were formed by fission, 
the density at the time of separation must have been exceedingly 
small. Moreover, their orbits are often very eccentric, and Nolke 3 
has shown that tidal friction is incompetent to produce such high 
eccentricities from the nearly circular initial orbits. Triple and 
multiple systems (as Russell 4 has shown) exhibit a grouping into 
close pairs, with widely distant companions (cf. 2 above) which is 
a necessary consequence of the theory that they have been formed 
by repeated fission, but may also be a consequence of other modes 
of origin. 

1 Phil. Trans. A., 218, 208 (1917).

2 F. R. Moulton, Astrophysical Journal, 29, 12-13, 1909.

3 Fr. Nolke, Abh. Nat. Ver. Bremen, 20, Teil 2, 1911.

4 H. N. Russell, Astrophysical Journal, 31, 185 (1910).

5 M. N. R. A. S., 79, 100 (1918), and 79, 408 (1919); "Problems of Cosmogony,"
Chapter XI.

Jeans 5 has pointed out that the effect of encounters between binary
systems and other stars which happen to pass near them tends,
in the long run, to an equipartition of energy between the radial
and transverse components of motion, and to an average orbital 
eccentricity of 0.64—a little greater than the observed mean value. 
He supposes that most of this action has taken place in the remote 
past, when the stars were closer together than now (see below) and 
that the majority of visual binaries probably started as neigh¬ 
boring nuclei when the stars were originally formed (in agreement 
with a previous conclusion of Moulton 1).

8. The theory of rotating masses has been employed by Jeans 2
in a bold and brilliant hypothesis of the origin of spiral nebulae, 
and even of our universe of stars. A huge rotating mass of rare¬ 
fied gas (or cloud of dust), would, upon contraction, assume a spher¬ 
oidal form, then as this grew more flattened, become lens-shaped 
till, in time, matter was thrown off by centrifugal force in the equa¬ 
torial plane. The tidal forces due to the attraction of the rest of 
the universe would localize the regions of ejection near two opposite 
points on the equator, and the gas, streaming out from these, would 
form two spiral arms enclosing the nucleus. 

This is a remarkably good model of a spiral nebula. Moreover 
Jeans shows that, if the rate of ejection is rapid enough, the out¬ 
going stream of gas will become unstable, and tend to break up into 
condensations or nuclei under its own gravitational attraction. 
Condensations of this sort are conspicuous on the photographs of 
many spiral nebulae. If the nebula is big enough, they may be so 
massive that they ultimately condense into stars. 

The masses calculated for spiral nebulae on this hypothesis are 
very great (5000 to 500,000,000 times the mass of the Sun), but
the mean density is excessively low ($4 \times 10^{-17}$ grams per cubic
centimetre). These values are roughly of the order of magnitude 
indicated by other observational data. 

More tentatively, Jeans postulates that a similar huge nebula, 
in the course of hundreds of millions of years, may have completely 
dispersed its substance into star-forming condensations, which, 
circulating in the general plane of the nebula, and spreading out to 
some sixty times its original diameter may have given rise to the 
existing galactic system of stars. 

1 See note 2, p. 15. 

2 M. N. R. A. S., 77, 186 (1917); “Problems of Cosmogony,” Chapter IX.



PART III. THE THEORY OF THE PROBLEM OF THREE OR MORE BODIES 1

When three or more bodies (taken as particles) move according 
to the Newtonian law of gravitation, the mathematical determina¬ 
tion of their motion presents great difficulty. The so-called re¬ 
stricted problem of three bodies in the plane is that special case in 
which two of the bodies move in circles about their center of gravity, 
while the third body is of negligible mass and moves in their plane 
attracted by them. We turn first to this case, which has particular 
importance for the reason that many of the fundamental character¬ 
istics of the more general problem appear in simple form. 
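
The restricted problem just defined is easily put in a form suitable for numerical experiment. The Python sketch below uses the customary rotating coordinates, an assumption of this illustration rather than anything stated in the report: the unit of mass is the sum of the two finite masses, the unit of length their distance apart, and the unit of time such that they revolve with unit angular velocity, so that they rest at (-mu, 0) and (1 - mu, 0). The equations are integrated by a fourth-order Runge-Kutta step, and the constancy of the Jacobi (energy) integral is used as a check. All function names and the sample initial state are hypothetical.

```python
import math


def gradient_omega(x, y, mu):
    """Partial derivatives of Omega = (x^2 + y^2)/2 + (1 - mu)/r1 + mu/r2."""
    r1 = math.hypot(x + mu, y)        # distance from the larger body at (-mu, 0)
    r2 = math.hypot(x - 1.0 + mu, y)  # distance from the smaller body at (1 - mu, 0)
    gx = x - (1.0 - mu) * (x + mu) / r1 ** 3 - mu * (x - 1.0 + mu) / r2 ** 3
    gy = y - (1.0 - mu) * y / r1 ** 3 - mu * y / r2 ** 3
    return gx, gy


def jacobi_constant(state, mu):
    """Jacobi integral 2*Omega - v^2, constant along every orbit."""
    x, y, vx, vy = state
    r1 = math.hypot(x + mu, y)
    r2 = math.hypot(x - 1.0 + mu, y)
    omega = 0.5 * (x * x + y * y) + (1.0 - mu) / r1 + mu / r2
    return 2.0 * omega - (vx * vx + vy * vy)


def derivatives(state, mu):
    """Equations of motion in the rotating plane, including the Coriolis terms."""
    x, y, vx, vy = state
    gx, gy = gradient_omega(x, y, mu)
    return (vx, vy, 2.0 * vy + gx, -2.0 * vx + gy)


def rk4_step(state, dt, mu):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, f):
        return tuple(si + f * ki for si, ki in zip(s, k))

    k1 = derivatives(state, mu)
    k2 = derivatives(shifted(state, k1, dt / 2), mu)
    k3 = derivatives(shifted(state, k2, dt / 2), mu)
    k4 = derivatives(shifted(state, k3, dt), mu)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))


def integrate(state, dt, steps, mu):
    for _ in range(steps):
        state = rk4_step(state, dt, mu)
    return state


if __name__ == "__main__":
    mu = 0.001                        # illustrative mass ratio
    start = (0.5, 0.0, 0.0, 0.9)      # illustrative initial position and velocity
    end = integrate(start, 0.001, 5000, mu)
    print(jacobi_constant(start, mu), jacobi_constant(end, mu))
```

Fixing the value of the Jacobi integral corresponds to fixing the energy constant, which leaves three arbitrary constants for each orbit.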

THE RESTRICTED PROBLEM OF THREE BODIES IN THE PLANE 

(a) Periodic orbits. 

Hill 2 was the first to realize the importance of certain periodic 
orbits in the restricted problem of three bodies for the lunar theory. 
Later G. H. Darwin 3 undertook extensive calculations based on 
mechanical quadrature and found other classes of such orbits not 
obtainable by Hill's methods. Moulton and his students 4 have 
applied the method of analytic continuation of Poincaré to the
treatment of various periodic orbits. Brown, 5 Stromgren 6 and
others have concerned themselves with types of periodic orbits of 
particular astronomical significance. In general only a beginning 
has been made with the determination of all the types of periodic 
orbits. Rigorously proven qualitative results are rare. 7 

It should be noted that the periodic orbits referred to, form 
closed curves in the plane rotating with the finite bodies. Orbits 
of ejection in which the small body collides periodically with one 
of the finite bodies are included. 4 The singularity of collision can 
be completely disposed of by mathematical transformation. 8 

1 For a report on recent literature see E. O. Lovett, Quarterly Journ. Math., 42, 252-316
(1911).

2 G. W. Hill, Am. Journ. Math., 1, 5-26, 129-147, 245-260 (1878).

3 G. H. Darwin, Acta Math., 21, 99-242 (1897).

4 F. R. Moulton, Proc. Math. Cong., Cambridge, England, 2, 182-187 (1913); also
see Periodic Orbits, Carnegie Inst., Washington, 1920.

5 E. W. Brown, M. N. R. A. S., 1911.

6 E. Stromgren, A. N., 168, 105-108 (1905); 174, 33-46 (1907).

7 G. D. Birkhoff, Rend. Circ. Mat. Palermo, 39, 265-337 (1915). Other references
are given in this paper.

8 T. N. Thiele, A. N., 138, 1-10 (1895); T. Levi-Civita, Acta Math., 30, 305-327
(1906).

As in most dynamical problems there are many types of non-periodic
orbits. Poincaré was the first to recognize sufficiently the
central importance of the periodic orbits and conjectured that every 
orbit may be approximated to for an arbitrary length of time by a 
periodic orbit. 1 This conjecture has neither been proved nor dis¬ 
proved, but it has been shown that for very general classes of 
dynamical problems, including the restricted problem of three 
bodies, every stable orbit has either certain properties of recurrence, 
or asymptotically approaches and recedes from orbits with such 
properties. 2 

(b) Integrability.

The concept of integrability is one which admits of various in¬ 
terpretations. Thus, if the differential equations of the problem 
under consideration are such that the coordinates may be expressed 
in terms of “known functions” of the time, the dynamical problem 
may be called integrable. Unfortunately a function which is not 
regarded as known at one time may be admitted later to the class 
of known functions. For instance, Painlevé in his Stockholm
Lectures 3 admits all functions defined by infinite series which converge
uniformly. If this be accepted, it follows immediately that
the restricted problem of three bodies is integrable. For, as was 
stated above, it is possible by change of variables to eliminate en¬ 
tirely the singularities of collision, and then find series of the type 
required. Unfortunately this type of integrability is of doubtful 
importance. 

Sundman 4 in his epoch-making work on the general problem 
of three bodies established that this problem also is integrable in the 
sense of Painlevé. But he was not able to draw any conclusions
therefrom, and his series were useless for purposes of computation. 

With so many more or less justifiable concepts of integrability, 
the question arises whether any one is to be preferred before the 
others. The answer seems to be that the differential equations of 
a dynamical problem should be called integrable in the vicinity of 
a particular periodic solution if the formal trigonometric series for 
this solution and nearby solutions converge for all values real or 
complex of the variables involved. In this sense it has not yet been
demonstrated that the restricted problem of three bodies is not 

1 H. Poincaré, Les méthodes nouvelles de la Mécanique Céleste; Paris, 1892-1899.

2 G. D. Birkhoff, Bull. Soc. Math. France, 40, 305-323 (1912); Acta Math., 43,
1-110 (1920).

3 Painlevé, Leçons sur la Théorie Analytique des Équations Différentielles, professées à
Stockholm. Paris, 1897.

4 C. F. Sundman, Acta Math., 36, 105-192 (1912). 



integrable although Poincaré has shown that these series do not
converge uniformly for all values of one of the parameters of the 
problem. 1 

(c) Reducibility.

In order to obtain a comprehension of the restricted problem of 
three bodies, it is necessary to deal with the totality of orbits for a 
given value of the energy constant. A partial explanation is the 
following: For periodic orbits the motion may be followed in¬ 
definitely. But a non-periodic orbit may have such complexity as 
to approach and recede from periodic orbits and more generally to 
be so related to other orbits that it is not possible to isolate the 
orbit completely. 

Now there are essentially three arbitrary constants involved in 
the restricted problem of three bodies, namely the two relative 
coordinates of the particle and the angular coordinate giving the 
direction of motion. If these three constants be interpreted as 
rectangular coordinates in space, the totality of orbits may be 
represented as the stream lines of an incompressible fluid in steady 
motion in this space. To a periodic motion will correspond a 
closed stream line. Hence if we imagine the closed stream line to 
be cut by a stationary surface S, there will correspond to the suc¬ 
cessive intersections of this surface by a stream line a sequence of 
points of the surface. The transformation T of the surface which 
takes each point into the next following one on the same stream 
line, and in particular takes the point of intersection of the closed 
stream line with the surface into itself, has a very intimate connec¬ 
tion with the dynamical problem. In fact, by this means, first 
introduced by Poincaré, 1 we are enabled to reduce the restricted
problem to the transformation of a surface into itself. 

Another way of seeing this reducibility is the following: Consider 
the direct variational periodic orbit of Hill in the restricted problem 
of three bodies. Orbits (for the same energy constant) which cross 
the line of apsides with nearly the same direction and abscissa as 
this periodic orbit are determined by the abscissa r of crossing and 
the direction given by an angular variable $\sigma$. When the small
body projected in this manner crosses a second time after a complete
circuit we have a new pair of variables $r'$, $\sigma'$. Now $r$, $\sigma$ may
be regarded as the rectangular coordinates of a point in the plane.
That transformation of the plane which carries $r$, $\sigma$ into $r'$, $\sigma'$
constitutes a transformation T. It is to be observed that the variational
orbit corresponds to an invariant point of the transformation.

1 See note 1, p. 18.

It will be found that the important qualitative properties of the 
orbit are mirrored in corresponding properties of the
transformation T. Thus if the problem were integrable in the specific sense
referred to above, the transformation near the invariant point 
would be exactly of the nature of a rotation in which the angle of 
rotation varies with the distance from the invariant point. 
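
To make the integrable picture concrete, the following Python sketch applies a rotation about the invariant point whose angle depends on the distance from that point. It is an illustration only: the name `twist_map` and the parameters `omega0` and `shear` are hypothetical and are not taken from the report. Every circle about the invariant point is carried into itself, which is the behavior described in the paragraph above.

```python
import math

def twist_map(x, y, omega0=0.3, shear=0.5):
    """Rotate (x, y) about the origin through an angle that depends on the radius.

    omega0 and shear are illustrative constants (assumptions, not from the text).
    """
    rho = math.hypot(x, y)
    angle = omega0 + shear * rho          # angle of rotation varies with distance
    c, s = math.cos(angle), math.sin(angle)
    return c * x - s * y, s * x + c * y

def orbit(x, y, steps):
    """Return the successive images of (x, y) under repeated application of the map."""
    pts = [(x, y)]
    for _ in range(steps):
        x, y = twist_map(x, y)
        pts.append((x, y))
    return pts

# All iterates of a point stay on the circle through the starting point.
pts = orbit(0.4, 0.0, 50)
radii = [math.hypot(px, py) for px, py in pts]
```

In this idealized case the circles play the part of the invariant closed curves; a non-integrable problem would be expected to deviate from this picture.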

As Poincaré showed, not only the orbits near a particular periodic
orbit but the totality of orbits can be treated by means of a
transformation T. The transformation may be regarded as a
transformation of the surface of a sphere into itself, approximately like
a rotation in which the angle of rotation varies with latitude. The
two fixed points correspond to the fundamental direct and
retrograde periodic orbits. 1, 2

(d) Stability. 

The outstanding problem of stability for the fundamental direct 
periodic orbit is this: will all slightly disturbed orbits remain
indefinitely in the vicinity of this periodic orbit? In terms of the
transformation T of a surface S, such stability would imply the 
existence of an infinite set of invariant closed curves as near as 
desired to the invariant point corresponding to this periodic orbit. 
In the interpretation by means of fluid motion, such stability would 
therefore mean the presence of infinitely many torus-shaped canals 
of stream lines enclosing the closed stream line which corresponds 
to the periodic orbit. 

Levi-Civita 3 has shown that when the periodic orbit is of such 
a type that the mean motion of the small body is commensurable 
with the mean motion of the two other bodies, there will not be 
stability in this sense. In fact, there will then be orbits
approaching and receding from the given periodic orbit. However there
remains the possibility that the degree of instability is limited. 

In the case of incommensurable mean motions the formal series 
are available, and this implies stability in the usual astronomical 
sense. However, there is a distinction between the type of
mathematical stability referred to above and astronomical stability.
The astronomical type can be treated by direct computational 

1 See note 1, p. 18.

2 G. D. Birkhoff, Rend. Circ. Mat. Palermo, 39, 265-337 (1915). Other references
are given in this paper.

3 T. Levi-Civita, Ann. di Mat., ser. 3, 5, 221-309 (1901).



methods, but the mathematical type presents the utmost difficulty.

From the practical standpoint one may ask the following two 
interesting questions. Suppose that the body of small mass is 
subject to arbitrary slight disturbance, although initially in the 
fundamental direct periodic orbit. What is the least time in which 
the particle can deviate by a stated amount from that orbit? What 
is the probable length of time that will elapse before it deviates by 
this amount? Neither of these questions appears to have received
consideration despite their interest for the lunar theory. 

THE PROBLEM OF THREE OR MORE BODIES 

Most of the facts outlined above have their analogues in the 
more general problem. 

The great increase in complexity precludes any attempt at an
enumeration of the types of periodic orbits. Nevertheless such 
orbits must be basic in any attempted treatment of the questions 
which arise. 

The question of ideal collision has been partially disposed of in the 
fundamental paper of Sundman referred to above and in later papers 
of Levi-Civita. 1 Sundman established that when the three bodies 
do not move in one plane a triple collision is impossible and that at 
double collision the behavior of the colliding bodies is essentially 
the same as in the ordinary two-body problem. 

Sundman went further and proved that the sum of the three mutual
distances will always exceed a specified positive constant.

Birkhoff has announced that the work of Sundman can be extended
to apply not only to the usual problem of three bodies
attracting each other under the Newtonian law, but to n bodies 
under similar laws, and that furthermore Sundman’s methods may 
be applied to give the conclusion that if the area integral constants 
be assigned and if the mutual distances are small enough initially, 
the sum of the mutual distances increases indefinitely with lapse of 
time. 

For periodic motions the sum of these distances remains finite of 
course, and the same is true of motions asymptotic to these periodic 
motions. It appears possible, however, that in all other cases the 
sum of the three distances increases indefinitely. 

Thus in an idealized earth, sun, moon problem it is likely that
the earth and moon will recede from the sun while the moon
approaches the earth.

1 T. Levi-Civita, Acta Math., 42, 99-144 (1919).

In the case of n nearby bodies we may anticipate that one or 
more will recede gradually from the others with lapse of time until 
finally the bodies are all remote from each other or moving in nearby 
pairs as in the two body problem. The source of the potential 
energy required will of course lie either in the high initial velocities 
or in the near approach of the paired bodies. From time to time, 
then, these bodies may approach so near as to collide. These 
conjectures seem in harmony with the facts of stellar distribution. 

The interest of the pure mathematician in the problem of three 
or more bodies has been stimulated by its importance for an
understanding of the past and future of the stellar universe. The
entrance upon the field of the theory of relativity of Einstein has 
altered this situation considerably. If the relativistic point of 
view prevails there can be little doubt that new factors of the
utmost importance will be introduced in astronomical speculation
concerning great lapses of time, although for limited intervals of 
time the classical problem of three or more bodies will maintain its 
importance. Only the very simplest features of the modifications 
required by the theory of relativity have as yet been determined, 
mainly those for a very small body in the presence of a central 
body. 1 

1 K. Schwarzschild, Sitz. Preuss. Akad. Wiss., 35, 189-196 (1916). See also W.
de Sitter, M. N. R. A. S., 76 (1916).





AN EXTENSION OF POINCARE'S LAST GEOMETRIC THEOREM. 

By 

GEORGE D. BIRKHOFF
of Cambridge, U. S. A.


1. Introduction.

The Crowned Memoir by Poincaré, «Le problème de trois corps et les
équations de la dynamique», in volume 13 of the Acta mathematica contained
the first great attack upon the non-integrable problems of dynamics. Under the
direction of Professor Mittag-Leffler, the Acta mathematica has had many
remarkable articles, but perhaps none of larger scientific importance than this
one. Its many ideas, in which the periodic motions took a central part, led
naturally to Poincaré's later dynamical researches.

In a highly interesting paper, «Sur un théorème de géométrie», published
shortly before his death in volume 33 of the Rendiconti del Circolo Matematico
di Palermo, Poincaré showed that a certain geometric theorem (proved by him
in particular cases) would carry with it the answer to some outstanding questions
concerning the periodic motions. The peculiarity of the method by which I
obtained a general demonstration of its truth soon afterwards,$^1$ and the dynamical
origin of the theorem itself, have suggested the extension given here.

In thus responding to the kind invitation of Professor Nörlund, I desire
to render homage to Professor Mittag-Leffler, especially because of the inspiring
tradition which he has established for the Acta mathematica.

1 Proof of Poincaré's Geometric Theorem, Transactions of the American Mathematical
Society, volume 14, or see a translation in volume 42 of the Bulletin de la Société Mathématique
de France.

38 - 25280. Acta mathematica. 47. Imprimé le 22 décembre 1925.




2. Statement of the Theorem.

Let $r$, $\theta$ stand for polar coordinates in the plane, so that $r = a > 0$ is the
equation of a circle $C$ of radius $a$. A doubly connected ring $R$, bounded by
the circle $C$ and a closed curve$^1$ $\Gamma$ encircling $C$, as well as a second like ring
$R_1$, bounded by the same circle $C$ and a like encircling curve $\Gamma_1$, will engage our
attention. The two rings, $R$ and $R_1$, are taken to be related in that a one-to-one,
direct, continuous point-transformation $T$ carries $R$ into $R_1$. Thus we may
write

$$C = T(C), \quad \Gamma_1 = T(\Gamma), \quad R_1 = T(R),$$

$$C = T_{-1}(C), \quad \Gamma = T_{-1}(\Gamma_1), \quad R = T_{-1}(R_1),$$

where the meaning of the notation is manifest.

The extension of Poincaré's last geometric theorem to be established here
is as follows:


Theorem. If $\Gamma$ and $\Gamma_1$ are met only once by any radial line $\theta$ = constant,
and if $T$ carries points on $C$ and $\Gamma$ in opposite angular directions (with respect
to $\theta$) to their new positions on $C$ and $\Gamma_1$ respectively, then either (a) there are
two distinct invariant points $P$ of $R$ and $R_1$ under $T$, or (b) there is a ring in
$R$ (or $R_1$) abutting upon $C$ which is carried into part of itself by $T$ (or $T_{-1}$).$^2$


In the form enunciated by Poincaré the boundaries $\Gamma$ and $\Gamma_1$ coincide,
while the alternative (b) is excluded by means of the hypothesis that an area
integral

$$\iint P \, r \, dr \, d\theta \qquad (P > 0)$$

is invariant under $T$.


The importance of the removal of the condition that $\Gamma$ and $\Gamma_1$ coincide
lies in the fact that the extended theorem may be applied to establish the

1 A closed curve will be defined as the common boundary of a finite, simply connected,
open continuum and the complementary open outer continuum. A ring is the region bounded by
two closed curves, one within the other. If these curves do not touch, the ring is a doubly
connected open continuum. No other type of ring enters here until the last section 8.

2 The restriction made on the curves $\Gamma$ and $\Gamma_1$ might be lightened in that these curves
need only to be «right-handedly accessible» and «left-handedly accessible», as these terms are
defined in my paper «Surface Transformations and their Dynamical Applications» in volume 43
[...] illustrate the same type of extension, and appears to be adequate for the dynamical applications.



existence of infinitely many periodic motions near a stable periodic motion in a 
dynamical system with two degrees of freedom. Furthermore the existence of 
motions which are not periodic but are the uniform limits of periodic motions 
then follows at once. The actual existence of such quasi-periodic motions has
not been proved hitherto as far as I am aware.$^1$ In the present paper I do not
enter into these dynamical applications. 

It is also worthy of note that the extended theorem does not involve the 
hypothesis of an invariant area integral, and so falls essentially in the domain
of analysis situs. Furthermore the existence of two distinct invariant points is 
established, whereas the possibility of only a single invariant point has not 
hitherto been excluded. 

The outstanding question as to the possibility of an n dimensional extension 
of Poincaré's last geometric theorem must now be briefly referred to.

An examination of the analytic properties of the motions near a given 
stable periodic motion in a dynamical system with n degrees of freedom, and 
of the corresponding transformation T to which it gives rise, is likely to show 
that there exist infinitely many nearby periodic motions. The theorem of 
Poincaré appears merely as the qualitative expression of the essential elements
of the analytic situation for $n = 2$; and in fact the most special case treated by
Poincaré then suffices to cover the dynamical applications.$^2$ To achieve the
appropriate n dimensional generalization of the theorem, it is necessary to 
determine the qualitatively essential elements of the n dimensional analytic 
treatment. Probably this can be accomplished in a simple way. 


3. $\delta$-Chains. Lemma 1.

Choose arbitrarily a number $\delta > 0$.

By means of the transformation $T$ any point $P_0$ on the circle $C$ is carried
into a point $T(P_0)$ on $C$. An outward radial motion through a distance $\delta_0$,
arbitrary except that $0 \le \delta_0 < \delta$, carries $T(P_0)$ to a point $P_1$ on the same radial

1 The notable investigations of H. Bohr have taken up the analytic representation of such
motions. See, for instance, his recent papers: Zur Theorie der fastperiodischen Funktionen, volume
45, Acta mathematica; Einige Sätze über Fourierreihen fastperiodischer Funktionen, volume 23,
Mathematische Zeitschrift.

2 In my Chicago Colloquium Lectures on Dynamical Systems, soon to appear in book form,
I establish these assertions.



line. Similarly an outward radial motion of $T(P_1)$ through a distance $\delta_1$, arbitrary
except that $0 \le \delta_1 < \delta$, carries $T(P_1)$ to a point $P_2$ on the same radial line. By
continuing in this manner a $\delta$-chain of points

$$P_0, P_1, P_2, \dots$$

is obtained, in which each point is derived from its predecessor by the application
of $T$ and a subsequent outward radial motion through a distance less than $\delta$.
The $\delta$-chain can only terminate at some $n$th stage when $P_n$ falls outside of $R$,
so that the transformation $T$ is not there defined. Such a terminating $\delta$-chain
will be called finite.

A precise condition for the non-existence of any finite $\delta$-chain is contained
in the following

Lemma 1. A necessary and sufficient condition that there exists no finite
$\delta$-chain is that there exists in $R$ an open ring $\Sigma$ abutting on $C$, which is carried
by $T$ into a ring $T(\Sigma)$ lying in $\Sigma$ and radially distant from the boundary of $\Sigma$
by at least $\delta$ in the outward direction.

The sufficiency of the condition is obvious. For if a point $P$ lies in such
a continuum $\Sigma$, its image $T(P)$ does and so do also the points obtained from
$T(P)$ by an outward radial motion through a distance less than $\delta$, just because
$T(P)$ lies in $T(\Sigma)$. Thus the successive elements $P_1$, $P_2$, ... of a chain must
continue to lie in $\Sigma$ and so in $R$, inasmuch as $P_0$ lies in $\Sigma$.
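
The construction of chains and the sufficiency half of Lemma 1 can be tried numerically. The Python sketch below is a toy illustration under stated assumptions: the ring is taken as `A <= r <= B` with both boundaries circles, the map `T` (a radial contraction toward the inner circle combined with a twist) is hypothetical, and the chain is built greedily by always pushing outward through a fixed fraction of delta. None of these choices comes from the paper.

```python
import math

A, B = 1.0, 2.0     # hypothetical ring R: A <= r <= B (outer boundary taken as r = B)
CONTRACT = 0.9      # hypothetical radial contraction factor of the toy map

def T(r, theta):
    """Toy transformation (an assumption): carries the circle r = A into itself,
    contracts radially toward it, and turns the two boundaries in opposite senses."""
    return A + CONTRACT * (r - A), (theta + 0.5 * (1.5 - r)) % (2 * math.pi)

def delta_chain(theta0, delta, push_fraction=0.999, max_steps=10_000):
    """Greedy chain: apply T, then move outward by push_fraction * delta (< delta).

    Returns the points up to the first one outside R, or None if the chain
    is still inside R after max_steps."""
    r, theta = A, theta0
    chain = [(r, theta)]
    for _ in range(max_steps):
        r, theta = T(r, theta)
        r += push_fraction * delta      # outward radial motion, less than delta
        chain.append((r, theta))
        if r > B:
            return chain                # finite chain: last point left R
    return None

finite = delta_chain(0.0, 0.2)          # pushes large enough to escape R
trapped = delta_chain(0.0, 0.05)        # pushes too small: chain stays inside
```

For the smaller value of delta the radial excess `r - A` of this toy chain stays below `0.05 / (1 - 0.9) = 0.5`, so the ring `A <= r < A + 0.5` plays the part of the ring in Lemma 1: its image lies in `r - A < 0.45`, at radial distance 0.05 inside the outer boundary.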

The necessity of the condition is also easily established. We begin by
considering the nature of the sets of points $M_0$, $M_1$, ... constituted by the points
$P_0$, $P_1$, ... respectively.

The set $M_0$ is the circle $C$ of course.

The set $M_1$ is evidently the open circular ring

$$a \le r < a + \delta.$$

It contains the set $M_0$ and is made up of inner points except for those of $C$.

The set $M_2$ contains all the points of $M_1$ and so of $M_0$. In fact it is
possible to find a single point $P_{-1}$ of $C$ which is taken to $P_0$ by $T$. Thus
$P_{-1}$, $P_0$, $P_1$ will form a $\delta$-chain of three points so that $P_1$ is a point of $M_2$ also.

Furthermore, except for the points of $C$, all of the points of $M_2$ are interior
points. In showing this to be the case it is clearly unnecessary to consider points
$P_2$ which belong to $M_1$. For such as do not, the corresponding $P_1$ is an interior




point of $M_1$. The transformation $T$, being one-to-one and continuous, will take
$P_1$ and its neighborhood into $T(P_1)$ and its neighborhood. A further outward
radial motion through a distance less than $\delta$ will take $T(P_1)$ and this neighborhood
into $P_2$ and its neighborhood. Hence $P_2$ is an interior point of $M_2$ in this
case also.

Finally, the set $M_2$ is connected, for it is obtained from the connected set
$T(M_1)$ by an outward radial motion through a distance less than $\delta$.

Thus it is seen successively that $M_1$, $M_2$, ... form a series of open connected
continua abutting on $C$, each of which contains its predecessors. If there
exists no finite $\delta$-chain, an infinite series of such regions is obtained, all of which
will lie in $R$. These will define a limiting open connected continuum $S$ abutting
on $C$. This continuum $S$ is of course nothing but the set of points which belong
to some $\delta$-chain.

Consider now the region $T(S)$. Since if a point $Q$ belongs to $M_n$, the
point $T(Q)$ belongs to $M_{n+1}$, it follows that $T(S)$ is an open connected continuum
abutting on $C$ which lies in $S$. Moreover if $T(Q)$ be moved in an outward radial
direction through a distance less than $\delta$, the point obtained will still belong to
$S$. Thus every point of $T(S)$ is radially distant from the boundary of $S$ by
at least $\delta$ in the outward direction.

Consequently, if $S$ were a ring it would be a region of the type declared
by the Lemma to exist. But it is evidently conceivable that the part of the
boundary of $S$ accessible from infinity may not constitute the whole of that
boundary. This will be the case when $S$ occludes certain regions or parts of its
boundary from infinity, and so is not a ring.

Suppose now that $S$ is not a ring and let $\bar S$ stand for the occluded point
set. Clearly the set $S + \bar S$ formed by $S$ and $\bar S$ does form a proper ring. We
proceed to prove that this augmented region $S + \bar S$ has the other properties
demanded of $\Sigma$ in Lemma 1.

Clearly $S + \bar S$ lies upon $R$, since $S$ does; and so $S + \bar S$ may be subjected to
the transformation $T$. Also $S + \bar S$ is carried into all or part of itself by $T$. For
if a point belongs to $S$ it has been seen to be carried into a point of $S$ by $T$;
whereas if a point belongs to $\bar S$ and so is occluded by $S$, it is carried into a
point occluded by $T(S)$, and all the more occluded by $S$, so that it belongs to
$S + \bar S$ also. Moreover a similar reasoning shows that every point of $T(S + \bar S)$ is
radially distant from the boundary of $S + \bar S$ by at least $\delta$ in the outward
direction. For if such a point belongs to $T(S)$ it has this property with reference



to the boundary of $S$, and so of course with respect to the boundary of $S + \bar S$;
whereas if a point belongs to $T(\bar S)$ it is derived from a point occluded by $S$,
and must be occluded by $T(S)$, so that a further outward radial motion through
a distance less than $\delta$ gives rise to a point occluded by $S$ and so in $S + \bar S$. This
last step involves the previously deduced relation between $S$ and $T(S)$.

Hence in every case $S + \bar S$ constitutes a ring $\Sigma$ having the properties stated
in Lemma 1. Thus the proof is completed.


4. Minimal $\delta$-chains.

Suppose now that there exists at least one finite $\delta$-chain. There will then
be a least positive integer $n$, for which a $\delta$-chain $P_0$, $P_1$, ..., $P_n$ exists such
that $P_n$ falls outside of $R$.

Such minimal $\delta$-chains have some interesting properties. For example it is
obvious that a point $P_i$ of such a chain belongs to $M_i$ but not $M_j$, $j < i$; in the
contrary case a finite $\delta$-chain of fewer elements could be at once constructed.
Thus $P_0$ is the only point of the $\delta$-chain on $C$, $P_1$ is the only point of the
$\delta$-chain in the open ring $a < r < a + \delta$, and so on.

The only other property which we shall require is not much less obvious:
if $P_i$ and $P_j$ ($i, j \ge 1$) lie on one and the same radial line, so that $T(P_{i-1})$ and
$T(P_{j-1})$ do also, then $T(P_{i-1})$ and $T(P_{j-1})$ will occur in the same radial order
as $P_i$ and $P_j$.

To establish this fact, we note first that $T(P_{i-1})$ and $T(P_{j-1})$ will not
coincide, for then $P_{i-1}$ and $P_{j-1}$ coincide, so that all the points of the chain
between $P_{i-1}$ and $P_{j-1}$ as well as one of these two points might be omitted
from the minimal chain. This is absurd. For a like reason $P_i$ and $P_j$ will not
coincide.

Now suppose that $T(P_{i-1})$ has an $r$ coordinate which is less than that of
$T(P_{j-1})$. This condition will be satisfied if $i$ and $j$ are named in the proper
order. The only possible radial ordering of the four points in question not in
accordance with the statement to be proved is

$$T(P_{i-1}), \quad T(P_{j-1}), \quad P_j, \quad P_i$$

where the radial coordinate increases from left to right; in fact, $P_i$ must lie
further out than $P_j$ which in turn is at least as far out as $T(P_{j-1})$. (In this
ordering it would be conceivable that $T(P_{j-1})$ and $P_j$ coincide.) But it is apparent



that $P_j$ is then obtainable from $T(P_{i-1})$ by an outward radial motion through a
distance less than $\delta$, and that $P_i$ is likewise obtainable from $T(P_{j-1})$. This is
true because the radial distance from $T(P_{i-1})$ to $P_i$ is less than $\delta$. Consequently
it follows that $P_j$ is a point of $M_i$ and also that $P_i$ is a point of $M_j$. But the
property first specified eliminates one of these two possibilities. Therefore the
stated ordering must hold.

5. The auxiliary transformation E.

Let now $P_0$, $P_1$, ..., $P_n$ be the points of any minimal $\delta$-chain. From the
property just established it follows at once that if $P_i$, $P_j$, $P_k$, ... ($i, j, k, \dots \ge 1$)
are the points of this chain which lie on a given radial line, then $T(P_{i-1})$,
$T(P_{j-1})$, $T(P_{k-1})$, ... occur in precisely the same radial order.



Imagine a point $Q$ to move outward from $r = a$ along this radial line. It
is nearly self-evident that a second point $\bar Q$ may be made to move simultaneously
on the same line, so as to be always at at least as great a radial distance as $Q$
but never exceeding it by as much as $\delta$, and furthermore so that when $Q$
coincides with $T(P_{i-1})$, $T(P_{j-1})$, $T(P_{k-1})$, ..., $\bar Q$ will coincide with $P_i$, $P_j$, $P_k$, ...
respectively.

This fact may be made graphically more evident as follows. Let $r_1$, $r_2$, ...
be the radial distances of $T(P_{i-1})$, $T(P_{j-1})$, ... arranged in order of increasing
radial magnitude, and let $s_1$, $s_2$, ... be the corresponding distances of $P_i$, $P_j$, ...
so that the inequalities obtain:

$$r_1 < r_2 < \cdots, \qquad s_1 < s_2 < \cdots,$$

$$0 \le s_1 - r_1 < \delta, \qquad 0 \le s_2 - r_2 < \delta, \qquad \dots$$




If we take the number pairs $(r_1, s_1)$, $(r_2, s_2)$, ... as the cartesian coordinates of
points in the plane, join these points in succession by straight line segments
(see Figure 1), and extend the broken line so obtained to right and left from
the two end points by lines making an angle of 45° with the positive $r$ axis,
the graph of a function $s = f(r)$ is given by the broken line. If $r$ be regarded
as the radial coordinate of $Q$ and $s$ as that of $\bar Q$, the correspondence between $Q$
and $\bar Q$ so defined has the desired properties.
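
The broken-line construction just described can be sketched in Python as follows. This is only an illustration under assumptions: the helper name `make_radial_map` and the sample pairs are hypothetical, chosen so that the radial coordinates and their images both increase and each image exceeds its original by less than delta.

```python
def make_radial_map(pairs):
    """Build s = f(r) from sorted pairs (r_i, s_i): straight segments between
    consecutive pairs, continued by lines of slope 1 (45 degrees) beyond both ends."""
    (r_first, s_first), (r_last, s_last) = pairs[0], pairs[-1]

    def f(r):
        if r <= r_first:                       # left extension at 45 degrees
            return s_first + (r - r_first)
        if r >= r_last:                        # right extension at 45 degrees
            return s_last + (r - r_last)
        for (r0, s0), (r1, s1) in zip(pairs, pairs[1:]):
            if r0 <= r <= r1:                  # linear interpolation on a segment
                return s0 + (s1 - s0) * (r - r0) / (r1 - r0)

    return f

delta = 0.2
pairs = [(1.2, 1.3), (1.5, 1.5), (1.9, 2.05)]   # hypothetical (r_i, s_i)
f = make_radial_map(pairs)

grid = [1.0 + 0.01 * k for k in range(151)]      # sample radii from 1.0 to 2.5
offsets = [f(r) - r for r in grid]               # outward displacement at each radius
```

Because the displacement `f(r) - r` is interpolated linearly between values lying in the interval from 0 up to (but not including) delta, and is constant on the two extensions, it stays in that interval everywhere; and since every segment has positive slope, the correspondence is one-to-one.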

It is conceivable that $\bar Q$ coincides with $Q$ at $r = a$, in which case, however,
$Q$ is of course not a point $T(P_{i-1})$: for, if it were, $Q$ must be $T(P_0)$ and $\bar Q$ must
be $P_1$ distinct from $Q$. In any case, by replacing the rectilinear part of the graph
for $r \le r_1$ by another of slightly less slope, a modified correspondence is obtained
which makes $\bar Q$ fall beyond $Q$ at the outset when $r = a$. It is convenient in what
follows to suppose this to have been done.

In this way there is defined along every radial line on which points
$P_i$, $P_j$, ... of the minimal $\delta$-chain fall, a one-to-one, continuous, outward radial
motion through a distance less than $\delta$ which takes every point $T(P_{i-1})$, $T(P_{j-1})$, ...
into its corresponding $P_i$, $P_j$, ...


All of these linear radial motions may be effected by a single one-to-one,
continuous, outward radial motion of the plane through a distance less than $\delta$ and
defined for $r \ge a$. For imagine in the above figure (Figure 1) a third $\theta$ axis
perpendicular to the plane of the $r$, $s$ axes, and imagine all of the graphs drawn
in their appropriate planes $\theta$ = constant. These broken lines all rise in the $r$
direction and lie at a vertical distance less than $\delta$ above the plane $s = r$. Join
the pairs of points of adjacent broken lines with the same $r$ coordinate by straight
lines. These evidently define a function $s = f(r, \theta)$ giving rise to an
outward radial motion $E$ for $r \ge a$ having the character required.

The results of the last two sections may now be incorporated in the following


Lemma 2. If there exists a finite $\delta$-chain and so a minimal $\delta$-chain
$P_0$, $P_1$, ..., $P_n$ with $n$ a minimum, then there exists a one-to-one continuous outward
radial motion $E$ of the plane through a radial distance less than $\delta$, which carries

$$T(P_0), \; T(P_1), \; \dots, \; T(P_{n-1})$$

into

$$P_1, \; P_2, \; \dots, \; P_n$$

respectively.



6. The auxiliary curve. Lemma 3.

We consider next the compound transformation $TE$ obtained by following
$T$ with such a transformation $E$. Clearly $TE$ is a one-to-one direct transformation
of $R$ into a ring $E(R_1)$, and carries the circle $C$ into a distinct continuous
closed curve $C_1$ which surrounds $C$. Furthermore $TE$ takes each point $P_0$,
$P_1$, ..., $P_{n-1}$ of the minimal $\delta$-chain corresponding to $E$ into $P_1$, $P_2$, ..., $P_n$
respectively. In fact, any point $P_{i-1}$ is carried by $T$ into $T(P_{i-1})$ and then by
$E$ into $P_i$. Since $P_0$ lies on $C_0 = C$, $P_1$ will lie on $C_1$.

By the application of $TE$ the doubly connected ring bounded by $C_0$ and
$C_1$ is taken into a like ring bounded by $C_1$ and $C_2$. This second ring abuts on
the outer side $C_1$ of the first ring, and the point $P_2$ lies on $C_2$. Thus, by
performing $TE$ successively, a succession of expanding rings $C_0 C_1$, $C_1 C_2$, ..., $C_{n-1} C_n$
is obtained, each abutting on its predecessor, while $P_0$, $P_1$, ..., $P_n$ lie on $C_0$,
$C_1$, ..., $C_n$ respectively.

Of course this process would terminate earlier if any ring $C_{i-1} C_i$ ($i < n$)
extended beyond $R$. But all points in $C_0 C_1$ evidently belong to $M_1$, all points
in $C_1 C_2$ to $M_2$, and so on, so that those in $C_{i-1} C_i$ belong to $M_i$, and cannot
lie outside of $R$ by the very definition of a minimal $\delta$-chain. On the other hand
$P_n$ on $C_n$ does lie outside of $R$, so that part of the ring $C_{n-1} C_n$ does extend
beyond $R$.

At this stage it is convenient to take $r$ and $\theta$ as the rectangular coordinates
of a point in the $r$, $\theta$ plane. From any one selected determination of the
transformation $T$ in this plane, all the others can be obtained by a translation in the
$\theta$ direction through any distance $2k\pi$ ($k = \pm 1, \pm 2, \dots$). The circle $C$ appears as a
straight line $r = a$, parallel to the $\theta$ axis; $\Gamma$ and $\Gamma_1$ appear as open curves lying
above this line and extending indefinitely far to right and left, while $C_1$, $C_2$, ...
are similar curves, $C_1$ lying above $C$, $C_2$ above $C_1$, and so on. All of these curves
are congruent in each interval

$$2k\pi \le \theta < 2(k+1)\pi.$$

The rings $C C_1$, $C_1 C_2$, ... appear as adjoining strips. The compound transformation
$TE$ carries each strip into the one immediately above it.

In this new plane join $P_0$, $P_1$ by a continuous arc $P_0 P_1$ without multiple
points, crossing the strip $C_0 C_1$ and having only $P_0$ and $P_1$ on $C_0$ and $C_1$
respectively. The arc $P_0 P_1$ is evidently carried by $TE$ into $P_1 P_2$ crossing the second
strip $C_1 C_2$. Again this arc $P_1 P_2$ is carried by $TE$ into an arc $P_2 P_3$ on the
third strip, and so on (see Figure 2). Obviously in this way a continuous curve
$P_0 P_1 P_2 \cdots P_n$ without multiple points is obtained.

Let $Q_0$ be the first point of $P_0 P_1 \cdots P_n$ to cross the boundary $\Gamma$ of $R$.
The point $Q_0$ evidently falls on $P_{n-1} P_n$ but is not the end point $P_n$. Let us
consider the image of $P_0 P_1 \cdots P_{n-1} Q_0$ under $TE$. The transformed curve
$P_1 P_2 \cdots P_n Q_1$ is made up of the arcs

$$P_1 P_2, \; P_2 P_3, \; \dots, \; P_{n-1} P_n, \; P_n Q_1,$$

and is obviously without multiple points. It is clear also that the transformed
curve has no point in common with $P_0 P_1$. Hence the auxiliary curve $P_0 Q_1$ has
no multiple points. It has the further property that $TE$ takes the part of it,
$P_0 Q_0$, crossing $R$, into $P_1 Q_1$, which lies partly outside of $R$ and crosses $E(R_1)$.

Since $P_0$ has a series of representative points in the plane of rectangular
coordinates, namely those obtained from any one by a motion to right or left
through a distance $2k\pi$, an infinite series of congruent curves $P_0 Q_1$ are
obtained. However, if we revert to $r$, $\theta$ as polar coordinates and choose the
arc $P_0 P_1$ so as not to have multiple points in this plane, then it is evident
that the curves $P_0 Q_1$ represented in the new plane are distinct from one
another and without multiple points.

The results thus obtained may be summarized in the

Lemma 3. Under the hypotheses and notation of Lemma 2, there exists a
continuous curve without multiple points,

$$P_0 P_1 \cdots P_{n-1} Q_0 P_n Q_1,$$

such that the compound transformation $TE$ carries the arc $P_0 Q_0$ crossing $R$ into
$P_1 Q_1$ crossing $E(R_1)$, while $P_0 P_1$ crosses the ring bounded by $C$ and $E(C)$.


7. The $\delta$-Theorem.

On the basis of the above three preparatory lemmas, we can now prove a 
theorem, out of which the extension of Poincare’s last geometric theorem stated 
in section 2 follows: 


$\delta$-Theorem. If $\Gamma$ and $\Gamma_1$ are met only once by any radial line $\theta$ = constant,
and if $T$ carries points on $C$ and $\Gamma$ in opposite angular directions (with respect
to $\theta$) to their new positions on $C$ and $\Gamma_1$ respectively, then for any $\delta > 0$ either
(a) there is a point $P$ of $R$ such that $T(P)$ of $R_1$ is on the same radial line and
is distant by less than $\delta$ from $P$, or (b) there is an open ring $\Sigma$ in $R$ (or $R_1$)
abutting on $C$, which is carried by $T$ (or $T_{-1}$) into a ring lying in $\Sigma$ and
radially distant from the boundary of $\Sigma$ by at least $\delta$ in the outward direction.


To establish this theorem it is evidently sufficient to prove that if there
exists no region $\Sigma$, there must exist a point $P$.

If there exists no region $\Sigma$ there will exist finite $\delta$-chains by Lemma 1, and
then in virtue of the properties developed in Lemmas 2, 3 there will exist an
auxiliary transformation $E$ and a curve $P_0 P_1 \cdots P_{n-1} Q_0 P_n Q_1$.

Imagine now a point $A$ to move along this curve from $P_0$ to $Q_0$, so that
its image $A_1$ under $TE$ moves from $P_1$ to $Q_1$. The vector $AA_1$ represented in
the plane in which $r$, $\theta$ are rectangular coordinates (Figure 2) will rotate through
a definite angle during this process, which we will designate by rot $AA_1$.

For definiteness let us assume that points of $C$ have their $\theta$ coordinate
increased under $T$, so that then, by hypothesis, points of $\Gamma$ have their $\theta$ coordinate
decreased under $T$. If $\alpha$ denotes the positive acute angle that the vector $P_0 P_1$
makes with the positive $\theta$ axis, while $\beta$ denotes the angle between $\pi/2$ and $\pi$
which $Q_0 Q_1$ makes with the same line, then the rotation in question is clearly
$\beta - \alpha$, or else differs from $\beta - \alpha$ by a multiple of $2\pi$. It is of central importance
for what follows to establish that this rotation is precisely $\beta - \alpha$.

Suppose that the strip bounded by $C$ and $E(\Gamma_1)$ which the auxiliary curve
$P_0 Q_1$ crosses, is deformed by a purely radial distortion so that $E(C)$ and $E(\Gamma_1)$




(which is continuous and meets each radial line precisely once, because of the
hypothesis made about $\Gamma_1$) become straight lines $r = b$ and $r = c$ while the line $C$
is not moved. Meanwhile rot $AA_1$ taken along the deformed curves will alter
continuously. Hence $\beta - \alpha$ measured in a similar manner will continue to be the
precise value of rot $AA_1$, or will continue to differ from it by one and the same
multiple of $2\pi$. Moreover, $\alpha$ and $\beta$ will continue subject to the same inequalities
as before:

$$0 < \alpha < \frac{\pi}{2}, \qquad \frac{\pi}{2} < \beta < \pi.$$


Suppose now that the auxiliary curve as thus modified into a curve crossing 
the Strip a£ri Sc, is deformed further on this strip while P 0 ,l’ ,,y 0 ,y, arc held 
fixed. It is again clear that, because of the continuity in the variation of 
rot A A, so long as the curve does not acquire multiple points, the stated formula 
will continue true or false in this second process of variation. 

But in the first place the arc $P_0P_1$ crosses the strip $a \le r \le b$ while $Q_0, Q_1$ lie outside of it. Hence $P_0P_1$ can be deformed on the strip into a rectilinear segment $P_0P_1$. Moreover the arc $P_1Q_0Q_1$ crosses the strip $b \le r \le c$, and can obviously be continuously deformed into the broken line $P_1Q_0Q_1$ without changing the position of $P_1, Q_0, Q_1$. Hence we obtain by legitimate modification a broken line $P_0P_1Q_0Q_1$ where these points are arranged in order of increasing $r$ coordinates, while $P_1$ has a greater $\theta$ coordinate than $P_0$ and $Q_1$ has a lesser $\theta$ coordinate than $Q_0$. In this normal position the validity of the expression $\beta - \alpha$ for rot $AA_1$ is self-evident. Hence it must also have held along the auxiliary curve with which we started, no matter how complicated that curve may have been.

On account of the inequalities to which $\alpha$ and $\beta$ were subjected, we conclude therefore that rot $AA_1$ is positive as the point $A$ moves from $P_0$ to $Q_0$ across $R$ along the auxiliary curve.

Now consider the modified transformation $TE_\lambda$, where $E_\lambda$ denotes that radial displacement which moves a point by $\lambda$ times the distance that $E$ does.

Thus $E_1$ coincides with $E$, while $E_0$ is the identical transformation in which no point is moved.

If $A_1$ designates the image of $A$ under $TE_\lambda$, it is plain that as $\lambda$ diminishes rot $AA_1$ will vary continuously unless $A$ and $A_1$ coincide for some $\lambda$. But this would lead at once to the conclusion of the d-theorem. Hence this possibility may be excluded. Consequently, since as $\lambda$ diminishes $P_1$ and $Q_1$ merely move along lines $\theta = $ constant to the right and left of $P_0$ and $Q_0$ respectively, the inequality rot $AA_1 > 0$ must continue to hold until $\lambda$ reaches 0.

Therefore the angular rotation of a vector drawn from a point $A$ to its image $A_1$ under $T = TE_0$, as $A$ moves along the auxiliary curve $P_0Q_0$, is positive. If the auxiliary curve be continuously varied into any other curve across the ring $R$, this rotation must vary continuously, or we are led to an invariant point and thus to the alternative (a). Hence it never reduces to 0, since the hypothesis of the theorem ensures that the vector $AA_1$ has a positive $\theta$ component for $A$ on $C$ and a negative $\theta$ component for $A$ on $\Gamma$. Thus the total rotation of the vector $AA_1$ is positive along any curve crossing $R$.

It is now necessary to note the complete symmetry between $T$ and $T_{-1}$ in the hypothesis and conclusion of the d-theorem. On this account we may take the inverse transformation $T_{-1}$ as fundamental, in which case the roles of $R$ and $R_1$, of $T$ and $T_{-1}$, are merely interchanged. Furthermore, the transformation $T_{-1}$ carries points on $C_1$ and $\Gamma_1$ in just the opposite $\theta$ direction. For definiteness it has been assumed that $T$ moves points on $C$ and $\Gamma$ to right and left respectively in the plane in which $r$ and $\theta$ appear as rectangular coordinates. Consequently $T_{-1}$ moves points on $C_1$ and $\Gamma_1$ to the left and right respectively in that plane.

With this slight modification in mind we arrive at the conclusion that the total rotation of the vector drawn from a point $B$ to its image $B_{-1}$ under $T_{-1}$ along any curve crossing the ring $R_1$ is negative.

But as $B$ crosses $R_1$, $B_{-1}$ crosses $R$ of course, and may be taken as a point $A$. Hence we infer that rot $A_1A$ is negative along any curve across $R$.

This is absurd, since the total rotation of the vector $A_1A$ is precisely the same as that of $AA_1$, which has already been proved to be positive under the stated circumstances.

Consequently the d-theorem is established.

8. Completion of the proof. 

The hypotheses of the theorem stated in section 2 include those of the d-theorem, and in addition we may exclude the alternative (b) of the d-theorem for any positive $d$. Hence for every positive $d$ there exists a point $P$ of $R$ which is carried by $T$ into a point $T(P)$ of $R_1$ on the same radial line and distant from $P$ by not more than $d$. A sequence of such points $P$ with $d$ approaching 0 evidently has at least one limiting point in $R$ and $R_1$, which is invariant under $T$.

Thus the existence of at least one invariant point of $R$ and $R_1$ is established.

If now we recur to the auxiliary plane in which $r$ and $\theta$ appear as rectangular coordinates, and allow a point $A$ to make a circuit in a positive sense of that part of $R$ contained between two parallels to the $r$ axis at a distance $2\pi$ apart, it is clear that rot $AA_1$ over the circuit vanishes, since the rotation is zero along the arc of $C$ and the arc of $\Gamma$, and cancels along the other two boundaries.

Evidently this circuit contains within it each invariant point only once, and thus the total rotation is the algebraic sum of the rotations over small circuits about the separate invariant points. But at a simple invariant point this rotation is $\pm 2\pi$. Hence there are at least two distinct invariant points unless there is a single multiple invariant point $K$ with a rotation 0 about it.

From the existence of a single invariant point the existence of a second invariant point follows in the general case by the above argument due to Poincaré. However, the proof that there does exist a second distinct invariant point is a much more delicate matter.

We will suppose that there exists one and only one invariant point $K$, and show that we are then led to a contradiction by means of a slight extension of our earlier argument.

Instead of considering a fixed positive $d$, we shall employ a $d(\theta)$ which varies from one radial line to another. An outward radial motion of a point $P$ through a distance less than $d$ refers then to the value of $d$ along the radial line on which $P$ lies. If $d = 0$, the point $P$ is to be held fixed. Evidently d-chains and minimal d-chains may be defined with respect to such a variable $d$.

In the case at hand we propose to take $d(\theta)$ positive except along the single radial line through the single invariant point $K$. Furthermore it is obviously possible to select $d$ so that it varies continuously with $\theta$ and is always less than the distance of $P$ to $T(P)$ or of $T(P)$ to $K$ for any point $P$ on the radial line in question, these distances being reckoned in the plane in which $r$ and $\theta$ appear as rectangular coordinates.

If $d$ is so chosen, no point of any d-chain can be the invariant point $K$, for such a point is obtained from its predecessor $P \neq K$ by imposing upon $T(P)$ an outward radial motion through a distance less than that of $T(P)$ from $K$.

Lemma 1 will continue to hold for this slightly modified type of d-chain provided that the outer boundary of the ring $\Sigma$ is allowed to touch $C$ at the point where the radial line through $K$ meets $C$. But no such region $\Sigma$ can exist in consequence of the exclusion of alternative (b) of section 2. Hence there will exist a finite d-chain, and thus a minimal chain $P_0, P_1, \ldots, P_n$ corresponding to this particular $d(\theta)$.

With reference to this particular minimal chain we can set up an auxiliary transformation $E$ having the properties incorporated in Lemma 2.

There then arises by consideration of a compound transformation $TE$ a series of rings $CC_1, C_1C_2, \ldots$ as before, with the single difference that the two boundaries of a ring may touch at one point. The points $P_0$, $P_1$ may be joined by a curve in $CC_1$ as before, and so an auxiliary curve

$$P_0 P_1 \cdots P_{n-1} Q_0 P_n Q_1$$

is obtained as in Lemma 3, except that this curve may possess double points without crossing, on account of the possibility that successive curves $C, C_1, C_2, \ldots$ may touch at a single point. Of course this auxiliary curve cannot pass through the invariant point $K$, which lies outside of the series of rings.

Proceeding now as in section 7 we consider rot $AA_1$ along the curve $P_0Q_0$, where $A_1$ is the image of $A$ under $TE$. The mode of determination of $d(\theta)$ ensures that $A_1$ is always distinct from $A$. It follows then as before that this rotation is positive along the auxiliary curve and remains so under the parametric transformation $TE_\lambda$ as $\lambda$ decreases from 1 to 0. Hence rot $AA_1$ along this curve when $A_1$ is the image of $A$ under $T$ must be positive. It will therefore remain positive along any curve which crosses $R$ and can be obtained from $P_0Q_0$ by a continuous deformation without passing over $K$. But even if the curve does pass over $K$, rot $AA_1$ is not thereby affected, since the rotation is 0 about $K$. Hence along any curve whatsoever that crosses $R$, rot $AA_1$ is positive.

But operating with $T_{-1}$, we infer that rot $A_1A$ = rot $AA_1$ is negative, and a contradiction is obtained.

Thus the theorem is established.

An easy extension of the above argument shows that either there exist two invariant points about each of which rot $AA_1$ is not 0, or there exist infinitely many invariant points.





Reprinted from Periodico di Matematiche
July 1926 - Series IV, vol. VI, no. 4 (pp. 262-271)


GIORGIO BIRKOFF


STABILITY AND PERIODICITY IN DYNAMICS
(STABILITÀ E PERIODICITÀ NELLA DINAMICA)


LECTURE DELIVERED AT THE SEMINARIO MATEMATICO
ROME, 6 MARCH 1926


BOLOGNA

NICOLA ZANICHELLI

PUBLISHER







In developing the subject of which I propose to speak it would be possible to develop certain generalities, but since I am addressing mathematicians, physicists and astronomers I think it preferable to consider an elementary example which will serve to make the general principles as clear as possible; and this is what I shall try to do.

What is the essential characteristic of dynamical systems? In my opinion it is this:

At every instant $t$ of time such a system is completely characterized by the numerical values of certain coordinates

$$x_1(t),\ x_2(t),\ \ldots,\ x_n(t);$$

in other words, the system is governed by the differential equations of motion

$$\frac{dx_i}{dt} = X_i(x_1, \ldots, x_n) \qquad (i = 1, 2, \ldots, n).$$

For example, if a particle falls in vacuo and if $x$ and $y$ denote the distance and the velocity respectively, these differential equations are

$$\frac{dx}{dt} = y, \qquad \frac{dy}{dt} = g$$

($g$ the constant of gravity).

Here $x_1 = x$ and $x_2 = y$ are the two coordinates in the sense indicated.

There exists also a second characteristic of dynamical systems, of which I shall speak later.




Once the equations of motion have been formulated, everything reduces to their mathematical solution, that is, to the determination of the characteristics of an arbitrary motion.

Such a motion corresponds to functions $x_1(t), \ldots, x_n(t)$ which at the instant of time $t_0$ take respectively the values

$$x_1^0,\ \ldots,\ x_n^0,$$

and which moreover satisfy the equations of motion.

The general problem of dynamics thus conceived does not differ from that of the solution of ordinary differential equations; and indeed it is dynamics which has obliged mathematicians to consider this vast and fertile field.

The two most interesting mathematical questions in dynamics are that of periodicity or recurrence and that of stability.

The first conclusion on which I wish to fix attention is the following:

To a certain extent the stability of a motion requires the property of periodicity, or of a quasi-periodicity.

I shall consider a very elementary example to clarify this extremely general principle.

Let us consider a particle $P$ which moves along a line, subject to a force depending on the position and on the velocity of the particle: this is a rather general case.

If $x$ denotes the distance of the particle from an origin $O$ on the line, and if $y$ denotes its velocity, we shall have the two equations of motion

$$\frac{dx}{dt} = y, \qquad \frac{dy}{dt} = f(x, y),$$

where the acceleration $f$ is a given real function, which we suppose analytic.

To treat the simplest of cases we shall also suppose that there exists for the particle only one possible position of equilibrium, $O$. This is equivalent to saying that the acceleration $f$ does not vanish for $P$ at rest, except when $P$ is at $O$:

$$f(x, 0) \begin{cases} = 0 & (x = 0) \\ \neq 0 & (x \neq 0). \end{cases}$$




To say that a motion is stable is by definition to say that the particle $P$ does not go to an infinite distance from $O$ and that its velocity remains always finite. In other words there will exist a number $M$ such that

$$|x(t)|,\ |y(t)| < M \qquad \text{for } t > 0.$$

Here is the geometric method, first employed by the great Poincaré in far more complicated cases, which will lead us to our goal.

Let us consider $x$ and $y$ as the rectangular coordinates of a point $Q$ of the plane. Then to all the possible states of the particle there will correspond points $Q$, and conversely a motion $x(t)$, $y(t)$ will be represented in this plane by a curve; and one and only one of these curves passes through each point of the plane, with a determinate direction

$$\tan\theta = \frac{dy}{dx} = \frac{f(x, y)}{y}.$$

When the time $t$ varies, the point $Q$ describes the representative curve with a velocity whose components in the directions of the two axes are respectively $\frac{dx}{dt}$ and $\frac{dy}{dt}$; and it is understood that this velocity is something quite different from the velocity of the particle $P$ along the line.

In particular, for equilibrium it is necessary and sufficient that constantly

$$\frac{dx}{dt} = \frac{dy}{dt} = 0;$$

in this case the curve degenerates into a single point, namely the point $O$.

All this is nothing but, in a particular case, the well-known graphical representation of ordinary differential equations. We can now proceed to consider the question more deeply.

The representative curve of the given stable motion begins, for $t = 0$, at a point $Q_0 = (x^0, y^0)$, and the point $Q$ will remain always ($t > 0$) in a square, since $|x|, |y| \le M$ in virtue of the property of stability.





If the point $Q_0$ lies at the origin $O$, we have the case of equilibrium; but we shall not consider this case, which is the simplest possible.

If the point $Q_0$ lies on the $x$ axis, but not at $O$, we shall have $\frac{dy}{dt} = f(x, 0) \neq 0$. Hence the point $Q$ cannot be on this axis without passing from one side of it to the other. Thus it is no loss of generality to suppose that, at the instant $t = 0$, $Q$ lies in the upper half-plane, as in the figure. This means that the initial velocity of the particle $P$ is positive. In this case we shall have $\frac{dx}{dt} = y > 0$. We conclude that the point $Q$ will move toward the right and will continue to do so until the moment when $Q$ lies on the $x$ axis, because then we shall have

$$\frac{dx}{dt} = 0.$$

After a certain instant, $Q$ will be in the lower half-plane, moving now toward the left, and will remain there until the moment when $Q$ lies once more on the $x$ axis. Continuing in this way, it is evident that the representative curve described by $Q$ will be composed of a finite or infinite number of arcs

$$Q_0Q_1,\ Q_1Q_2,\ \ldots$$

where $Q_1, Q_2, \ldots$ are the successive points on the $x$ axis, and where each of these arcs is cut in a single point by the secants parallel to the $y$ axis. At these points $Q_1, Q_2, \ldots$ the direction of the curve is vertical, because we have

$$\frac{dx}{dt} = 0, \qquad \frac{dy}{dt} \neq 0.$$


But these considerations can be carried further. In the first place, two successive points $Q_i$, $Q_{i+1}$ cannot lie on the same side of $O$: in the contrary case $\frac{dy}{dt} = f(x, y)$ would evidently have opposite signs at these two points, and $f(x, 0)$ would have to vanish at an intermediate point, which would be contrary to the hypothesis that there is only one position of equilibrium, $O$.

Let us note one more thing.

The representative curve cannot terminate, at a finite time, at the point $O$; this would be contrary to the fundamental existence theorem for ordinary differential equations.

It is thus proved that the particle will oscillate along the line about the position of equilibrium a number $n$ of times,

$$n = 0, 1, 2, \ldots \quad \text{or} \quad n = \infty,$$

while the time $t$ increases from 0 to infinity.

But the topological properties of the curve described by $Q$ allow us to say that if the number of points $Q_1, Q_2, \ldots$ is infinite, either the curve will tend toward an inner curve $C$ (which may reduce to $O$), or it will tend toward an outer curve $C$, or the curve $Q_0Q_1Q_2\ldots$ is a closed curve. In the first case $Q_3$ is to the left of $Q_1$; in the second case $Q_3$ is to the right of $Q_1$; in the third $Q_3$ coincides with $Q_1$.

Let us consider meanwhile the case in which the number $n$ is finite. The particle $P$ will in this case make a finite number $n$ of oscillations about $O$ along its line of motion, after which it will move always to the right or always to the left, remaining at a distance less than $M$ from $O$ (hypothesis of stability). The particle $P$ must therefore tend toward the position of equilibrium, and consequently toward $O$.

We have now arrived at the following conclusion: either the particle $P$ is in equilibrium; or $P$ tends toward the position of equilibrium, oscillating a finite or infinite number of times with damped oscillations; or $P$ tends toward a periodic motion, oscillating indefinitely about the position of equilibrium with ever increasing or ever decreasing oscillations; or $P$ oscillates periodically about the position of equilibrium.
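
The phase-plane picture above can be explored numerically. The sketch below is only an illustration under assumptions: the acceleration `damped` (a linearly damped spring, whose sole equilibrium is the origin) and the classical Runge-Kutta step are choices made here, not part of the lecture. It integrates $dx/dt = y$, $dy/dt = f(x, y)$ from a point $Q_0$ in the upper half-plane and records the successive points $Q_1, Q_2, \ldots$ where the representative curve meets the $x$ axis.

```python
def axis_crossings(f, x0, y0, dt=0.001, steps=25000):
    """Integrate dx/dt = y, dy/dt = f(x, y) by classical RK4 and return the
    x-coordinates of the points where y changes sign (Q_1, Q_2, ...)."""
    def deriv(x, y):
        return y, f(x, y)

    x, y = x0, y0
    crossings = []
    for _ in range(steps):
        k1x, k1y = deriv(x, y)
        k2x, k2y = deriv(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
        k3x, k3y = deriv(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
        k4x, k4y = deriv(x + dt * k3x, y + dt * k3y)
        xn = x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6
        yn = y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6
        if y * yn < 0:
            # Linear interpolation of the point where y vanishes.
            s = y / (y - yn)
            crossings.append(x + s * (xn - x))
        x, y = xn, yn
    return crossings

def damped(x, y):
    # Hypothetical acceleration: f(x, 0) = -x vanishes only at x = 0.
    return -x - 0.2 * y

qs = axis_crossings(damped, 0.0, 1.0)
print(qs)
```

For this damped example the successive crossings alternate between the two sides of $O$ and shrink in absolute value, which corresponds to the first case described above ($Q_3$ to the left of $Q_1$).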

We have thus verified in this elementary case the general principle that every stable motion of a dynamical system is periodic, or approaches a periodic motion.

This is an almost entirely general phenomenon. By their very definition unstable motions will always tend to be replaced by stable motions. It is therefore very natural to have dynamical systems whose motions are all of the stable type.

For these completely stable dynamical systems, all the motions approach the periodic motions from time to time, and I believe that these special motions are determinative for the more general motions.

For example, if in our problem of the particle $P$ this restriction of complete stability were imposed, one would obtain an illustration of this principle.

In this case there will evidently be an infinite number of periodic motions, which can be ordered according to their amplitudes.

Every non-periodic motion will vary between two neighboring periodic motions of this series, with amplitudes always increasing or always decreasing as the time $t$ varies from

$$t = -\infty \quad \text{to} \quad t = +\infty.$$

Very vaguely, all this explains to us why the stability of the solar system and the representability of the motion of the system by means of trigonometric series are facts closely bound to one another; moreover, it explains somewhat why Poincaré was right to direct his attention almost exclusively toward periodic motions.

In a certain sense one may say that motions of this type play, with respect to dynamical systems, the same role that the singularities of analytic functions play with respect to their properties.

Let us pass now to consider the second fundamental characteristic of dynamical systems, at least in certain important cases: an arbitrary infinitely small displacement of a particular motion may remain always small in both senses of time, for example if the dynamical system is reversible; and in truth this circumstance actually presents itself in many cases.

If one studies the equations of motion in the immediate neighborhood of any particular motion, one is led to the result that, for complete stability thus conceived, it is necessary and sufficient, at least from the formal point of view, that an infinity of special conditions be satisfied by the coefficients of the expansions of the $X_i$. Poincaré has shown that these conditions are satisfied for the canonical equations of Hamilton; and, conversely, it appears that if these conditions of stability are satisfied the equations can be transformed into a Hamiltonian form.

In our particular problem this condition will not be satisfied unless all the motions are periodic: this fact is entirely evident.

Thus the only significance of the Hamiltonian form is, perhaps, that it defines the dynamical systems which possess the second property of complete stability. Any other variational principle, such as that of Hamilton, would have the same advantage. The fact that the principal function $H$ designates the total energy depends only on the quite particular choice of the coordinates and, in my view, does not have the almost mysterious significance which is attributed to it; the enormous formal structure which the Hamiltonian form has imposed on almost the whole of dynamics seems to have a mathematical rather than a physical importance.

What, on the other hand, can be said of the classical Lagrangian form?

In this form, with the aid of the concepts of space, of time, of rigid body, and of laws of force such as Newton's, one can immediately write down the kinetic energy $T$, the potential energy $U$, and hence the equations of motion in the form

$$\frac{d}{dt}\frac{\partial T}{\partial \dot q_i} - \frac{\partial T}{\partial q_i} = \frac{\partial U}{\partial q_i},$$

where $t$ designates the time and where $q_1, \ldots, q_n$ are the coordinates relative to space.

And experience itself shows that these equations are applicable in many cases with a sufficient degree of precision. But the more one understands the fundamental importance of electricity, the more it appears that the methods of dynamics are no longer sufficient for physics. Nevertheless all the methods proposed up to now for formulating physical laws have something in common with the methods of dynamics: in fact, the notion of an initial state as determining the motion exists also in the electrodynamic theories. One may therefore hope that the most notable conclusion relative to periodicity, already cited, can be applied in the new physics: only one will have to consider coordinates infinite in number, and it may be that the variational principle will again show its power in practice.

The concepts of dynamics are important not only from this point of view. Attempts have been made to extend them into the field of biology, and thus one has been led to the mechanical theories of life. But, in my opinion, the mechanical hypothesis presents many difficulties; and this is what I wish to explain a little.

To develop my thought I consider the actual universe as a kind of cinematography; there exists then a number of states, very numerous but finite in number, which we may denote by

$$a,\ b,\ c,\ \ldots$$

In this case one has a particular succession of films (in three dimensions)

$$\ldots\ r,\ a,\ c,\ b\ \ldots$$

which gives the complete history of the world, of the organic world as well as of the inorganic.

To say that the universe thus conceived is mechanical is equivalent to saying that a state such as $r$ is always immediately followed by another, $a$. Let us suppose that this is true.

Since there exists only a finite number of letters, some one of these letters will be repeated in the series.

Let us suppose, for example, that $r$ is one of the letters indefinitely repeated. But the letter $r$ is always followed by the same letters arranged in the same order. The series must therefore consist of a succession of different letters

$$r,\ a,\ c,\ b,\ \ldots\ r,$$

indefinitely repeated.

Thus this little mechanical universe will always have an absolutely periodic course.
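
The argument is the pigeonhole principle applied to a deterministic law on finitely many states, and it can be checked mechanically. In the sketch below the function name `find_cycle` and the four-letter law `successor` are hypothetical, chosen only to mirror the letters $r, a, c, b$ of the text. The code follows the law from a starting state until some state recurs and then returns the block of distinct states that repeats indefinitely.

```python
def find_cycle(successor, start):
    """Follow state -> successor[state] from `start`. Return (prefix, cycle):
    `cycle` is the block of distinct states that repeats forever, `prefix`
    the states visited before the sequence enters that block."""
    seen = {}        # state -> position of its first visit
    history = []
    state = start
    while state not in seen:
        seen[state] = len(history)
        history.append(state)
        state = successor[state]
    first = seen[state]
    return history[:first], history[first:]

# Hypothetical mechanical law: each letter is always followed by the same one.
successor = {"r": "a", "a": "c", "c": "b", "b": "r"}
prefix, cycle = find_cycle(successor, "r")
print(prefix, cycle)
```

A sequence that is only followed forward in time may have a non-repeating prefix before it enters the cycle; the two-sided history considered in the text has no such prefix, since the repeated letter determines everything that follows it.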




We could make our cinematographic universe mechanical in a somewhat less restricted way. To this end let us suppose that a pair of successive letters such as $r$ and $a$ is always followed by the same letter $c$. Then if we write

$$(a, b) = A, \quad (a, c) = B, \quad \ldots$$

we can represent the same universe by means of an infinite series of these new letters, and proceeding as before we shall be led to the conclusion that the universe is still periodic, although in the part which recurs some letter $a, b, \ldots$ may be repeated.

More generally it may be that every particular succession of $h$ letters is always followed by the same letter. We shall again have a mechanical, and necessarily periodic, universe.

On the contrary, if the succession of letters did not follow a law of this nature the universe considered would not be mechanical. One would arrive at this if all the letters had been taken at random. But even if the letters are chosen at random, the ratios of the frequencies of different letters will in general tend to unity. In this way one sees also that non-mechanical systems are subject to laws of probability.

The matter remains almost the same if, instead of a cinematographic universe, one considers rather a completely stable dynamical system. In fact, as we have already said, stability still requires periodicity.

Then if the actual universe were truly such a system, its present state would be approximately repeated indefinitely often in the future.

I cannot make this order of ideas fully clear, but I have simply wished to indicate how the relation noted between stability and periodicity presents a fundamental difficulty for the hypothesis of mechanism in the biological domain.

In any case everyone knows that for a long time mechanical theories have been of dominant importance for scientific thought.

In fact it seems that the mind cannot make progress without making use of mechanical methods in its discoveries. The ideal state of a scientific theory is that of dynamics, because otherwise the aim of complete prediction cannot be attained. Perhaps it will be found that every separate and well-defined part of our universe has a mechanical character without the whole universe being subject to this property.

And since it is not possible otherwise to understand well the function of dynamics for scientific thought, I have sought precisely to develop somewhat the significance of stability and periodicity in dynamics.





DYNAMICS. On the significance of the canonical equations of dynamics (Sur la signification des équations canoniques de la dynamique). Note (*) by George D. Birkhoff.


If the functions $X_i$ ($i = 1, \ldots, n$) of the $n$ variables $x_1, \ldots, x_n$ are real and analytic, and if $t$ designates the time, the differential system

$$\frac{dx_i}{dt} = X_i(x_1, \ldots, x_n) \qquad (i = 1, \ldots, n) \tag{1}$$

defines the steady motion of a fluid in the space of the variables $x_i$. It is evident that every motion of this kind, in the neighborhood of an arbitrary point where the velocity is not zero, is equivalent to every other such motion by means of a suitable point transformation.

The situation is entirely different either in the neighborhood of a point where all the $X_i$ vanish (case of equilibrium) or in the neighborhood of a closed trajectory (case of periodicity). Here we consider only the first case, which is simpler.

The stable case of equilibrium can be defined in the following manner: suppose that the point of equilibrium lies at the origin $O$, and consider an arbitrary motion which, at a given instant, is at $P_0$ near $O$. In the stable case there exists, for every positive integer $n$, another integer $m$ such that the coordinates $x_i$ are expressible by trigonometric sums of $m$ terms

$$\sum_{j=1}^{m} \left( A_j \cos \alpha_j t + B_j \sin \beta_j t \right)$$

with an error of order $\overline{OP_0}^{\,n}$, during an interval of time at least of the reciprocal order.

In the stable case, it is almost evident that the $n$ roots of the characteristic equation

$$\left| \frac{\partial X_i}{\partial x_j} - \lambda\,\delta_{ij} \right| = 0 \tag{2}$$

must be purely imaginary or zero. In order to consider only the simplest typical case, we set aside the possibility of zero roots. Moreover, we suppose that these roots

$$\lambda_1,\ -\lambda_1,\ \ldots,\ \lambda_m,\ -\lambda_m \qquad (n = 2m)$$

are not connected by any linear relation

$$p_1\lambda_1 + \cdots + p_m\lambda_m = 0,$$

where $p_1, \ldots, p_m$ are integers. A preliminary linear change of the dependent variables suffices to give the equations the special form

$$\frac{dx_i}{dt} = \lambda_i x_i + \cdots, \qquad \frac{dy_i}{dt} = -\lambda_i y_i + \cdots \qquad (i = 1, \ldots, m),$$

where the omitted terms are series in $x_i, y_i$ which begin with polynomials of at least the second degree.

(*) Session of 30 August 1926.

For stability of equilibrium it is necessary that the equations can be transformed to a normal form

$$\frac{d\xi_i}{dt} = M_i\,\xi_i, \qquad \frac{d\eta_i}{dt} = -M_i\,\eta_i \qquad (i = 1, \ldots, m), \tag{3}$$

where the formal series $M_i$ contain only the $m$ products $\xi_1\eta_1, \ldots, \xi_m\eta_m$ and begin with the constants $\lambda_i$; this fundamental fact can be proved very briefly by a suitable sequence of transformations. The equations in $\xi_i, \eta_i$ thus obtained are formally integrated by

$$\xi_i = c_i\,e^{M_i t}, \qquad \eta_i = d_i\,e^{-M_i t} \qquad (i = 1, \ldots, m),$$

where the variables $\xi_1\eta_1, \ldots, \xi_m\eta_m$ of the $M_i$ are replaced by $c_1d_1, \ldots, c_md_m$ respectively. The $x_i, y_i$ are then given by the corresponding trigonometric series.

Thus stability of equilibrium necessitates the indicated trigonometric form of the small perturbations.

The same procedure shows that in the case of unstable equilibrium the trigonometric series are replaced by more general series, analogous to those studied by M. Picard. In this case the coordinates do not have the same almost periodic character.

How can one recognize the equations whose perturbations are trigonometric? The normal equations (3) give us

$$\xi_i\eta_i = c_id_i = k_i \qquad (i = 1, \ldots, m).$$

This means that

$$\varphi_i(x_1, \ldots, x_m, y_1, \ldots, y_m) = k_i \qquad (i = 1, \ldots, m),$$

where the series $\varphi_i$ are real and without linear terms.

A first criterion for trigonometric perturbations is the existence of $m$ integrals $\varphi_i = k_i$ which begin with special quadratic terms.

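To go further we remark that the equations (3) are canonical, that is to say of the form

$$\frac{d\xi_i}{dt} = \frac{\partial H}{\partial \eta_i}, \qquad \frac{d\eta_i}{dt} = -\frac{\partial H}{\partial \xi_i} \qquad (i = 1, \ldots, m)$$

with principal function

$$H = \sum_{i=1}^{m} \xi_i\eta_i\,\psi_i(\xi_1\eta_1, \ldots, \xi_m\eta_m),$$

where the series $\psi_i(u_1, \ldots, u_m)$ are defined by the system of partial differential equations

$$M_i(u_1, \ldots, u_m) \equiv \psi_i + \sum_{j=1}^{m} u_j \frac{\partial \psi_j}{\partial u_i} \qquad (i = 1, \ldots, m).$$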

But the canonical equations are all comprised in the single variational principle

$$\delta \int \Big[ \sum_{i=1}^{m} \eta_i\,d\xi_i - H\,dt \Big] = 0,$$

where one finds a linear sum in $d\xi_1, \ldots, d\eta_m, dt$ under the integral sign. When the variables $x_1, \ldots, x_n$ are employed in place of $\xi_i, \eta_i$, this sum will remain linear in $dx_1, \ldots, dx_n, dt$. Consequently, the equations (1) result from a variational principle

$$\delta \int \big[ P_1\,dx_1 + \cdots + P_n\,dx_n + Q\,dt \big] = 0. \tag{4}$$

Thus another criterion, necessary and sufficient for trigonometric perturbations, is the existence of such a variational principle (4). By a suitable choice of the dependent variables, the equations can be given the canonical form.

The fact that the canonical form always possesses trigonometric perturbations was proved by Poincaré. It is a converse situation that we point out here, in view of which the variational principle and the canonical equations seem to lose a little of the almost mysterious importance that has been accorded to them.

All the above results are formal. But interesting questions arise concerning the convergence of the series employed, the study of which we have undertaken in extensive cases.

(Extract from the Comptes rendus des séances de l'Académie des Sciences, t. 183, p. 516, session of 20 September 1926.)

Gauthier-Villars, printer and bookseller of the Comptes rendus des séances de l'Académie des Sciences. Paris, Quai des Grands-Augustins, 55.




On certain central motions of dynamical systems (Über gewisse Zentralbewegungen dynamischer Systeme).

By

George D. Birkhoff.

Presented at the session of 30 July 1926 by R. Courant.

1. Introduction.

For a very general class of dynamical systems the totality of states of motion can be brought into one-to-one correspondence with the points $P$ of a closed $n$-dimensional manifold $M$, in such a way that, for suitable coordinates $x_1, \ldots, x_n$, the differential equations of motion in the neighborhood of any point can be written

$$\frac{dx_i}{dt} = X_i(x_1, \ldots, x_n), \qquad i = 1, \ldots, n.$$

Here the $X_i$ are real analytic functions and $t$ denotes the time. The motions are then represented by curves lying in $M$. Through each point $P_0$ of $M$ there passes one and only one such curve of motion, and the position of a point $P$ on this curve varies analytically with $P_0$ and with the time interval $P_0P$. One may imagine that as $t$ varies each point of $M$ moves along its curve of motion, so that a steady flow of $M$ into itself is generated.

The differential equations of classical dynamics are of a more special
form. In particular, they possess an invariant $n$-dimensional integral
over $M$. Consequently every small "molecule" $\sigma$ of $M$ must later
(or earlier) overlap any initial position $\sigma_0$ that it occupies at a
time $t_0$. For suppose that in a short time interval $\tau$ the molecule
has moved to a completely new position, and consider its original position
together with the sequence of its positions after $\tau, 2\tau, \ldots$
units of time,

$\sigma_0, \sigma_1, \sigma_2, \ldots$

Ges. d. Wiss. Nachrichten. Math.-phys. Klasse. 1926. Heft 1.



These cannot all be mutually disjoint. Indeed, if $v$ denotes the value of
the invariant integral over $\sigma_0$, then this value is the same for
$\sigma_1, \sigma_2, \ldots$, and so on. But since the value of this
invariant integral over the whole of $M$ is finite, say $V$, the number of
mutually disjoint positions in the sequence of molecules cannot exceed
$V/v$.

Hence certain $i$-th and $j$-th molecules ($i < j$) overlap. But if these
overlap, then so must the corresponding positions $i$ steps earlier.
Consequently the $(j-i)$-th position of $\sigma$ covers part of its initial
position. With this argument and its natural extension Poincaré ¹) proved
that, in general, the motions of such a dynamical system of classical type
return infinitely often to the neighborhood of any initial state and so
possess a kind of stability "in the sense of Poisson."
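The counting argument above can be tried out on a toy example. The sketch below is an illustration only, not part of the paper: it assumes the simplest measure-preserving system one could pick, a rigid rotation of a circle of total length $V = 1$, and takes the molecule to be an arc of length $v$. The function name `first_return` and the chosen rotation number are hypothetical.

```python
import math

def first_return(alpha, v, max_steps=100_000):
    """Smallest k >= 1 for which the arc [k*alpha, k*alpha + v) overlaps the
    initial arc [0, v) on a circle of length 1 (assumes v <= 1/2).

    Two arcs of equal length v overlap exactly when the circular distance
    between their starting points is smaller than v.
    """
    for k in range(1, max_steps + 1):
        start = (k * alpha) % 1.0
        if min(start, 1.0 - start) < v:
            return k
    return None

# Rotation by the golden-ratio conjugate, an arbitrary irrational choice.
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0

for v in (0.2, 0.03, 0.007):
    k = first_return(ALPHA, v)
    print(f"v = {v}: first overlap after {k} steps, bound V/v = {1.0 / v:.1f}")
```

In this toy setting the first overlap always occurs within $V/v$ steps, which is the bound the pigeonhole argument gives.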

It is the aim of the present paper to show that with an arbitrary
dynamical system there is always associated a definite closed set of
central motions, which fill a submanifold $M_r$ of $M$, which possess this
property of "regional recurrence," and which all the other motions of the
given system approach in a quite definite sense.

Evidently this investigation can also be regarded as a study of the
properties of ordinary differential equations of the above type; these are
of so general a form that they can readily be treated quite abstractly. In
the cases $n = 1, 2$ the results are well known; here the central motions
are given by the equilibrium states and the periodic motions.

2. Wandering and non-wandering motions.

Consider an arbitrary point $P_0$ of the manifold $M$ of states of motion.
Let $\sigma$ be an open continuum of small diameter, not exceeding
$\epsilon$ ²), which contains $P_0$. As the time increases, this "molecule"
$\sigma$ moves. It may be that $P_0$ represents an equilibrium state; the
molecule at $P_0$ would then continually overlap itself. In every other
case $\sigma$, if chosen small enough, will move outside itself, since the
velocity components $dx_i/dt$ throughout the molecule in its initial
position are approximately the same as at $P_0$.

1) Méthodes nouvelles de la Mécanique Céleste, vol. 3.

2) It is clear that distance in $M$ can be defined in a fairly arbitrary
way. The diameter of a point set is the least upper bound of the distances
between pairs of points of the set.

If it is possible to take $\epsilon$ so small that $\sigma$ never again
overlaps its first position, we shall call $P_0$ a wandering point. In the
contrary case $P_0$ shall be called a non-wandering point; naturally we
count equilibrium points in this category as well.

As far as these definitions go, there is an apparent asymmetry between
the two directions of the time $t$, but it is easily seen that it is only
apparent. Indeed, if the molecule $\sigma$ overlaps its initial position
again after $\tau$ units of time, the same is true $\tau$ units earlier;
for the two overlapping molecules $\sigma_0$ (i.e., $\sigma$ at the start)
and $\sigma_\tau$ (i.e., $\sigma$ after $\tau$ units of time) occupy, $\tau$
units earlier, the overlapping positions $\sigma_{-\tau}$ and $\sigma_0$.

Thus the wandering point $P_0$ is characterized by the fact that the
small molecule $\sigma$ containing $P_0$ describes an $n$-dimensional tube
which never intersects itself as $t$ runs from $-\infty$ to $+\infty$. For
this reason the designation "wandering point" appears justified, inasmuch
as $P_0$ never returns to the immediate neighborhood of a point it has
already traversed.

The corresponding motions will naturally be called wandering or
non-wandering according as the moving point belongs to the one class or to
the other.

The set $W$ of wandering points of $M$ consists, if it exists, of curves
of motion which fill open $n$-dimensional continua. The set $M_1$ of
non-wandering points ¹) of $M$ forms the closed complementary set of
curves of motion.

For the reasons just given, all of these assertions are clear, except
perhaps the one that $W$ is open and $M_1$ closed. But if $P_0$ is a
wandering point, so is every point of the molecule $\sigma$ which contains
$P_0$ and does not overlap itself. This shows at once that $W$ is made up
of open $n$-dimensional continua, since every point of $W$ is an interior
point.


1) This set must exist, since the limit points of any moving point of $M$
are evidently non-wandering points.


If the set $M_1$ of non-wandering motions contains points which are not
limit points of the set $W$, then these form a set $M_1'$ of curves of
motion which fill open $n$-dimensional continua and possess the property of
"regional recurrence."

It is clear that $M_1'$ is composed of a set of curves of motion; for if
any point $Q$ of $M_1'$ does not lie in the immediate neighborhood of a
curve of motion of $W$, the same holds for all other points of the curve of
motion containing $Q$. Likewise a sufficiently small molecule containing
$Q$ lies entirely in $M_1'$, so that $M_1'$ is made up of open
$n$-dimensional continua of non-wandering points. From this the property of
regional recurrence follows.

Evidently the set $M_1 - M_1' = M_1''$ is the set of boundary points of
the $n$-dimensional open continua $W$, $M_1'$. The set $M_1''$ is therefore
of lower dimension than $n$.

As the time $t$ increases or decreases, every wandering point approaches
the set $M_1$ of non-wandering curves of motion.

The proof of this fundamental property of wandering motions is
immediate. Consider a small open neighborhood of the closed set $M_1$ and
the closed complementary set $C$, which consists only of points of $W$.
About each point of $C$ a molecule $\sigma$ can be constructed which is so
small that it never overlaps itself as the time varies. Now a finite number
of these molecules can be found which cover $C$ completely, and in such a
way that every point of $C$ is an interior point of at least one of these
molecules. A moving point can enter each of these only once ¹) (the
molecules being regarded as fixed in place) and can remain in it only for a
short interval of time. It is therefore evident that after a finite time
the moving point will have passed through all the molecules cut by its
curve of motion. After this time has elapsed the moving point must remain
in the given small neighborhood of $M_1$; in the same way it will remain in
this neighborhood throughout as $t$ decreases indefinitely. Since the small
neighborhood of $M_1$ was arbitrary, the point must approach $M_1$
asymptotically in either direction of the time, q.e.d.


1) It is permissible to assume that the molecules are homeomorphic to the
$n$-dimensional solid ball and are traversed at most once by neighboring
curves of motion.
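A one-dimensional toy flow makes the statement just proved concrete. The sketch below is an illustration under assumptions that are not in the paper: it takes the flow $d\theta/dt = \sin\theta$ on a circle, whose only non-wandering points are the equilibria $\theta = 0$ and $\theta = \pi$, and uses the closed-form solution $\tan(\theta/2) = \tan(\theta_0/2)\,e^{t}$. The function names are hypothetical.

```python
import math

def flow(theta, t):
    """Exact time-t map of d(theta)/dt = sin(theta), for 0 < theta < pi."""
    return 2.0 * math.atan(math.tan(theta / 2.0) * math.exp(t))

def time_outside(theta0, delta, t_max=40.0, steps=160_000):
    """Midpoint-rule estimate of the total time, over -t_max <= t <= t_max,
    that the orbit through theta0 spends farther than delta from both
    equilibria 0 and pi."""
    dt = 2.0 * t_max / steps
    total = 0.0
    for k in range(steps):
        theta = flow(theta0, -t_max + (k + 0.5) * dt)
        if delta < theta < math.pi - delta:
            total += dt
    return total

def exact_time_outside(delta):
    """Closed form: time needed to travel from theta = delta to pi - delta."""
    return 2.0 * math.log(1.0 / math.tan(delta / 2.0))

# Every wandering orbit tends to the equilibria in both directions of time
# and spends the same finite time outside their delta-neighborhood.
for theta0 in (0.3, 1.0, 2.5):
    print(theta0, time_outside(theta0, 0.1), exact_time_outside(0.1))
```

In this example the time spent outside the neighborhood is the same for every wandering orbit, so it depends only on the neighborhood chosen.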




3. Some further properties of the wandering motions.

A somewhat closer study brings to light certain further characteristic
features of the manner in which the wandering motions approach the
non-wandering motions. Since we have seen (end of Section 2) that the
moving point enters each of the fixed molecules covering $C$ at most once
and remains in it only for a short span of time, the following is clear:

Every wandering point remains outside a given neighborhood of $M_1$ for
only a finite total time $T$, and leaves this neighborhood only a finite
number of times $N$ ¹) in all, where $N$ and $T$ are completely determined
as soon as the neighborhood of $M_1$ is chosen.

Now one may ask: is not the whole time interval, from the first exit of
the moving point from the neighborhood of $M_1$ just described to its last
entry into it, uniformly bounded? If the answer is affirmative, various
interesting conclusions can be drawn. But there seems to be no reason to
suppose that this is the case in general. To make clearer the manner in
which the wandering motions approach $M_1$, one may proceed as follows. Let

$P_1Q_1,\ P_2Q_2,\ \ldots,\ P_nQ_n,\ \ldots$

be a sequence of arcs of curves of motion whose time intervals increase
without bound. Here both $P_n$ and $Q_n$ are to lie in $C$ for every value
of $n$, and $P_n$ must precede $Q_n$ on the curve. Every limit point of a
sequence $P_n$ we call an $\alpha$-point (with respect to the given
neighborhood of $M_1$), and every limit point of a sequence $Q_n$ a
$\beta$-point.

If the interval between the first exit from the neighborhood of $M_1$
and the last entry into it were uniformly bounded, there could evidently be
no $\alpha$- or $\beta$-points. Conversely, if $\alpha$- and $\beta$-points
exist, this interval is not uniformly bounded.

Both the set of $\alpha$-points and that of $\beta$-points are of course
closed. Furthermore, if $P$ is an $\alpha$-point, so evidently is every
other point of $C$ on the curve of motion through $P$; a similar remark
naturally holds for $\beta$-points.

1) To obtain the number of exits exactly, it suffices to take the number
of exits through the molecules covering $C$, where these molecules have the
properties laid down in the preceding footnote.

Neither the $\alpha$-set nor the $\beta$-set can fill an $n$-dimensional
region in $C$. Suppose that the $\alpha$-points filled a region $E$ of $C$.
Let $N$ be the greatest number of the molecules covering $C$ that are
entered by any point $P$ in the interior of $E$ as $t$ runs from $-\infty$
to $+\infty$. All sufficiently near points will enter the same molecules as
$P$ and hence no others. But since $P$ is an $\alpha$-point, points can be
found among these which enter before and after any prescribed time interval
and hence enter molecules other than those entered by $P$. It is entirely
clear that for points sufficiently near $P$ the $N$ entries occur in
approximately the same time interval as for $P$. These conclusions may be
summarized as follows:

Both the set of $\alpha$-points and the set of $\beta$-points in $C$
consist of certain curves of motion, and both sets are nowhere dense.

Now we consider the motion of all points of $C$ as the time $t$ increases
without bound. After a sufficiently long time interval has elapsed,
evidently all parts of $C$ except the immediate neighborhood of the
$\alpha$-points will have moved permanently out of $C$ into the
neighborhood of $M_1$; for only in the immediate neighborhood of
$\alpha$-points are there points $P$ with a long arc $PQ$ on the curve of
motion through $P$ such that $Q$ also still lies in $C$.

As the time increases (decreases), all points of $C$ outside a given
neighborhood of the set of $\alpha$-points ($\beta$-points) leave $C$
within a definitely fixed time interval, thereafter remaining in the given
neighborhood of $M_1$. For points inside such a neighborhood of the set of
$\alpha$- ($\beta$-) points, the time within which $C$ is left is not
bounded.

4. The sequence $M, M_1, M_2, \ldots$.

Having arrived at the closed set $M_1$ of non-wandering curves of
motion, between whose points distance is defined as in $M$, we are in a
position to define wandering and non-wandering points relative to $M_1$
(instead of $M$) as follows. Take an arbitrary point $P_0$ of $M_1$ and an
open continuum $\sigma$ of small diameter which contains $P_0$ and certain
other points of $M_1$. Leaving aside the case in which $P_0$ is an
equilibrium point, and choosing the diameter of $\sigma$ small enough, we
can say that $P_0$ is a wandering point relative to $M_1$ (although of
course a non-wandering point relative to $M$) if $\sigma$ never overlaps
itself in $M_1$ as the time increases. The other points of $M_1$,
equilibrium points included, shall be called non-wandering points relative
to $M_1$.

It is clear that the analogy between $M$ and $M_1$ is complete. The
non-wandering points $M_2$ of $M_1$ form a closed set of motions which
every point $P$ of the set $W_1$ of wandering points of $M_1$ approaches
asymptotically as the time increases or decreases, and indeed with the
uniformity property which we found in the case of $M$.

The same process can now be carried out with $M_2$ as base, and so $W_2$
and $M_3$ can be defined. Continuing in this way we define
$M_1, M_2, M_3, \ldots$. We regard the process as ended when some $M_{i+1}$
coincides with $M_i$, in which case of course no further $W_i$ occur and
$M_{i+2}, M_{i+3}, \ldots$ also coincide with $M_i$. In case the process
does not break off in this way, there arises an infinite sequence of closed
sets $M, M_1, M_2, \ldots$, each of which is contained in the preceding one
as a proper subset. These define a single closed limiting set $M_\omega$ of
curves of motion, namely those which are common to all the sets of the
sequence. A further application of this process to $M_\omega$ yields
$M_{\omega+1}, M_{\omega+2}, \ldots$ (so long as these are distinct from
their predecessors). Thus there arises an ordered set

$M, M_1, M_2, \ldots$
$M_\omega, M_{\omega+1}, \ldots$
$\ldots$

in one-to-one correspondence with Cantor's ordinal numbers up to that
$M_r$ at which the process breaks off, if it breaks off at all. But these
elements $M_p$ form an ordered set of distinct closed point sets, each of
which possesses an immediate successor and is contained in all the
preceding ones. Such an ordered set must be countable.

Hence there exists a well-ordered terminating sequence of distinct closed
sets

$M, M_1, \ldots, M_\omega, \ldots, M_r,$

in which the element $M_{p+1}$ immediately following any element $M_p$
consists of curves of motions non-wandering relative to $M_p$, while $M_r$
consists only of curves of motions non-wandering relative to itself. An
element $M_p$ without an immediate predecessor is the limit set of its
predecessors. The wandering points $W_p$ of $M_p$ tend asymptotically
toward the non-wandering motions $M_{p+1}$, and in such a way that the
total time which any point of $M_p$ spends outside a given neighborhood of
$M_{p+1}$, as well as the number of exits from this neighborhood, is
uniformly bounded.

The last set $M_r$ shall be called the set of central motions. Evidently
the symbolic decomposition of $M$ is given by the formula

$M = \sum W_p + M_r.$

One could now enter into further details concerning the structure of these
sets $M_{p+1}$ (analogously to the investigations of Section 2 for $M_1$),
by dividing $M_{p+1}$ into the set $M_{p+1}'$ of points non-wandering
relative to $M_p$ which are not limit points of $W_p$, and the closed
complementary set. It is clear that $M_{p+1}'$, together with its limit
points, is contained entirely in every subsequent set, $M_r$ included,
since it consists of points non-wandering relative to itself.

The ordinal number $r$ of the set of sets $M_p$ is closely connected with
the dimension $n$ of $M$. It is easy to show that for $n = 1$ this ordinal
number is at most 1 (it can of course also be 0, which is the case when all
points are equilibrium points); likewise for $n = 2$ this ordinal number is
at most 2. It will further be proved that the ordinal number cannot exceed
$n$ in those cases in which each set $M_p$ consists of subsets of a single
definite dimension. Since, to begin with, the constituents $M_1'$, $M_1''$
have, as we have seen, the dimension $n$ and a lower dimension respectively
(Section 2), it follows under our hypothesis that either $M_1$ consists of
$M_1'$ and its limit points, or $M_1$ is identical with $M_1''$. In the
first case all points of $M_1$ are non-wandering and $r = 0$ or 1. In the
second case $M_1 = M_1''$, and $M_1$ is therefore of dimension at most
$n - 1$, since it consists of limit points of $W$, which itself consists of
open $n$-dimensional continua. Proceeding in the same way with
$M_2, M_3, \ldots, M_n$, one finds that the process either breaks off after
fewer than $n$ steps, or breaks off with an $M_n$ of dimension 0 which
contains only equilibrium points. In spite of this general form of the
argument, which remains valid even in more general cases, it seems to me
very probable that the ordinal number $r$ of the sets $M_p$ can exceed the
number $n$ for $n > 3$.

5. Some properties of the central motions.

A first, fundamental property of the central motions of $M$ is the
following:

The central motions possess the property of regional recurrence.

This statement merely repeats the fact that $M_r$ possesses no points
wandering relative to itself. We have repeated it, first because of its
importance, and then because we wish to show that it entails the existence
of curves of motion of $M_r$ which enter a given fixed molecule of $M_r$
infinitely often in both directions of the time. The method of proof goes
back to Poincaré (loc. cit.).

Let $P_0$ be an arbitrary point of $M_r$. In its immediate neighborhood
there is then a point $P$ and an arc $PQ$ of the curve of motion through
$P$, where $Q$ also belongs to the neighborhood, while the time interval
between $P$ and $Q$ is not small. Now construct about $P$ a still smaller
neighborhood such that it lies in the given neighborhood of $P_0$ and such
that the corresponding neighborhood about $Q$ also lies within it. Since
$P$ was non-wandering, a point $P'$ can be found in the smaller molecule
about $P$ which was already in that molecule earlier, at a point $R'$,
while the point in the molecule about $Q$ corresponding to $P'$ shall be
called $Q'$. Thus we have found a curve of motion $R'P'Q'$ which contains
the points $R'$, $P'$, $Q'$ in the given molecule about $P_0$. $P'$ and
$Q'$ lie in the neighborhoods of $P$ and $Q$ respectively. We can continue
analogously and find a curve of motion $R''P''Q''S''$, where $P''$, $Q''$,
$R''$ lie in the small neighborhoods about $P'$, $Q'$ and $R'$, while $S''$
lies in the same neighborhood about $P'$. In this way we obtain an infinite
sequence of arcs with time intervals increasing without bound, such that
the limit point $\bar{P}$ of $P, P', P'', \ldots$ evidently lies in the
given neighborhood of $P_0$ and possesses a curve of motion which enters
the neighborhood infinitely often in both directions of the time $t$. In
fact this curve, by its very definition, returns infinitely often in both
directions of the time to the immediate neighborhood of each of its points
of the type of $\bar{P}$.



It is clear that all periodic motions of $M$ lie in $M_r$, since these are
of the non-wandering type relative to any $M_p$. The same holds for states
of equilibrium and for any other motion which returns to the immediate
neighborhood of one of its positions, such as the "recurrent motions" which
I have defined elsewhere.

Every motion of $M$ enters any given neighborhood of the set $M_r$ of
central motions at least once in every time interval $T$, where $T$ depends
only on the choice of the neighborhood.

We begin the proof with the remark that every motion of $M$ comes
uniformly often into any given neighborhood of $M_1$ (Section 3). Since
moreover every motion of $M_1$ comes uniformly often into any given
neighborhood of $M_2$ (Section 4), every arc $PQ$ with a sufficiently long
associated time interval and with a $P$ lying sufficiently near $M_1$ will
come arbitrarily near $M_2$; hence every motion of $M$ comes uniformly
often into any given neighborhood of $M_2$.

Continuing in this way, we can establish this property not only for
$M_2$ but also for $M_3, M_4, \ldots$. If the sequence extends to
$M_\omega$, the property must hold for $M_\omega$, as the limit of $M_n$
when $n$ tends to $\infty$. For $n$ can be taken so large that $M_n$ lies
in an arbitrary neighborhood of $M_\omega$. Hence the property holds for
$M_\omega$, and likewise for all subsequent $M_p$ including $M_r$, which is
what we wished to show.

The above result is characteristic of the set of "recurrent motions,"
which form a special class of central motions. Concerning the manner in
which the motions of $M$ approach the set of central motions, a much
sharper statement can be made by combining the known facts regarding the
approach of the motions of $M_p$ to $M_{p+1}$ for all values of $p$. The
statement so obtained would be complicated to formulate and would depend
on the ordinal number of the set $M_p$. We content ourselves with
establishing a somewhat weaker result, which nevertheless suffices to bring
out the significance of the central motions.

If the time intervals during which a moving point (describing an arc
$PQ$ of a curve of motion) remains in a given region have the sum $\tau'$,
while the total time is $\tau$, we shall say that the probability that the
point of the arc lies in this region is given by the ratio $\tau'/\tau$.




The probability that a point of the arc $PQ$ of a curve of motion in $M$
lies within an arbitrary neighborhood of the set $M_r$ of central motions
converges uniformly to 1 as the associated time interval increases without
bound.

We shall prove this fact by establishing the corresponding assertions
obtained by successively replacing $M_r$ by $M_1, M_2, \ldots$.

Evidently the assertion holds for $M_1$, since the total time interval
during which any curve of motion of $M$ lies outside a given neighborhood
of $M_1$ is uniformly bounded.

Let us now pass to the corresponding assertion for $M_2$. Here we shall
use the result just established for $M_1$, together with the fact that the
total time interval during which a curve of motion of $M_1$ lies outside a
given neighborhood of $M_2$ is uniformly bounded.

Now every point of $M$ which lies near enough to a point of $M_1$ will
remain uniformly near this point of $M_1$ for an arbitrarily long time.
From this we conclude that a neighborhood of $M_1$ can be chosen so small
that every point in its interior is the initial point of an arc $PQ$ of a
curve of motion, corresponding to a prescribed large time interval $T$,
which lies in the interior of the given neighborhood of $M_2$ except for a
time interval $\tau$ fixed in advance independently of $T$.

But the probability that a point of an arc $RS$, corresponding to a
sufficiently long time interval $T'$, lies in the interior of the chosen
neighborhood of $M_1$ exceeds $1 - \epsilon$ ($\epsilon > 0$, arbitrary).
Every point of $RS$ in the interior of this neighborhood is the initial
point of an arc $PQ$ which corresponds to the time interval $T$ and which
lies well within the given neighborhood of $M_2$, except at most during an
interval $\tau$ fixed in advance.

This state of affairs makes it clear at once that the assertion made is
correct for $M_2$.

We now employ exactly the same method to show this for
$M_3, M_4, \ldots$.

If the assertion holds for $M_1, M_2, \ldots$, it also holds for
$M_\omega$. In fact, one can find an $n$ such that $M_n$ lies in any
prescribed neighborhood of $M_\omega$, so that the probability that a point
of a long arc $PQ$ lies in the neighborhood of $M_\omega$ is arbitrarily
near to unity.

This process can be continued indefinitely and shows finally that $M_r$
possesses the desired property.
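The sojourn probability $\tau'/\tau$ can be computed exactly in a simple case. The sketch below is an illustration under assumptions that are not in the paper: it takes the flow $d\theta/dt = \sin\theta$ on a circle, whose central motions are the equilibria $\theta = 0$ and $\theta = \pi$, and uses the closed-form solution $\tan(\theta/2) = \tan(\theta_0/2)\,e^{t}$. The function names are hypothetical.

```python
import math

def sojourn_probability(theta0, tau, delta):
    """Fraction tau'/tau of the time interval [0, tau] that the orbit of
    d(theta)/dt = sin(theta) starting at theta0 (0 < theta0 < pi) spends
    within delta of the equilibria 0 and pi."""
    u0 = math.tan(theta0 / 2.0)  # tan(theta/2) evolves as u0 * exp(t)
    t_in = math.log(math.tan(delta / 2.0) / u0)           # theta reaches delta
    t_out = math.log(1.0 / (math.tan(delta / 2.0) * u0))  # theta reaches pi - delta
    outside = max(0.0, min(t_out, tau) - max(t_in, 0.0))
    return 1.0 - outside / tau

def worst_case(tau, delta, samples=999):
    """Smallest sojourn probability over a grid of starting points."""
    return min(
        sojourn_probability(math.pi * (k + 1) / (samples + 1), tau, delta)
        for k in range(samples)
    )

# The worst case over all sampled arcs tends to 1 as the arc gets longer.
for tau in (10.0, 100.0, 1000.0):
    print(tau, worst_case(tau, 0.1))
```

Because the time spent outside the neighborhood is bounded by a constant that does not depend on the starting point, the lower bound on the probability is uniform over all arcs of a given duration.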

6. Remark on atomic dynamics.

The results lead to an interesting remark concerning atomic dynamics.
Suppose that an atomic system were regarded as a dynamical system in the
sense that its instantaneous state is fixed by $n$ coordinates
$x_1, \ldots, x_n$. If the dynamical system were of the ordinary classical
type, the central motions would make up all possible motions and would all
be of equal significance. Hence, under perturbation and radiation of
electromagnetic energy, one would expect a continuous spectrum. If, on the
other hand, the central motions consist only of periodic motions and
equilibrium states, then the probability that an arbitrary motion is in the
neighborhood of one of these central motions is equal to one. Under slight
perturbations we should expect radiation with the periods of these central
periodic motions. Under heavier perturbations the motion would vary from
one fixed period to another. These results seem to agree in a general way
with the facts of radiation. Consequently the only possibility of a
dynamical conception of an atomic system seems to lie in the use of a
non-classical type of equations, for which the central motions consist of
periodic motions and equilibrium positions.





Reprinted from Amer. Jour. Math., January 1927, Vol. 49, p. 1-38. 


Stability and the Equations of Dynamics.* 

By George D. Birkhoff.


Introduction. 

For a long time attention has been directed towards the equations of 
dynamics, and more and more there has been the tendency to take these as 
being derivable from a special type of variational principle and as consequently 
possessing 'Hamiltonian' or 'canonical' form. In consequence of the fact
that the mathematical treatment of these equations involves difficulties due 
to the lack of flexibility of the underlying group of transformations, a very 
large formal literature has grown up. On the other hand, physicists, realizing 
the sufficiency of such equations for most applications and impressed with the 
purely mathematical developments, have yielded a position of the very first 
importance to them, just as have the mathematicians because of the physical 
applications. 

For these reasons there has been a disposition to attribute an almost 
mysterious significance to the Hamiltonian equations. 

Now there is one property of these equations, first explicitly and adequately
recognized by Poincaré, which is of obvious significance. If there is
given an equilibrium position or a periodic motion of a dynamical system of 
this type, which satisfies the simplest necessary conditions for stability, and 
if there are no relations of commensurability between the underlying periods, 
then there will be complete formal trigonometric stability. This fact is of 
course of great physical interest, and throws great light upon the inward 
nature of the stability existing in the solar system for instance. 

It is the main contention of this paper that essentially the only significance
of the Hamiltonian (and thus the Lagrangian) equations is that they
possess this characteristic property of complete formal stability. In other
words, any set of n equations possessing complete formal stability at an
equilibrium point, for example, may be given Hamiltonian form by appropriate
formal change of variables.

In the present paper attention is primarily directed to the case of a
system of equations of even order, although it is proved that the odd order
case essentially reduces to that of even order. Furthermore, the treatment
is restricted to the typical case of the equilibrium point, although I expect to
treat the analogous important case of a periodic motion elsewhere, in so far
as the generalization is not an obvious one.

* See a preliminary note in the Comptes rendus for September 20, 1926. The
[...] Philadelphia [...] of the [...]

The precise nature of the Hamiltonian equations is not adequately expressed
in the mere statement that the property of stability is their one
characteristic feature. Indeed, except from a purely formal point of view,
these equations form a particular kind of completely stable systems, because
of their convergence properties. The interrelation between the various kinds
of stability of systems of ordinary differential equations, the nature of the
trigonometric series employed and their degree of accuracy, the formal
reduction to Hamiltonian or Pfaffian form in the case of complete stability, must
be studied in detail before the very interesting situation which exists can be
rightly comprehended. It is my purpose in this paper to clarify this situation
as far as possible.

Many other problems are immediately suggested. I will only refer to one. 
In recent work on the theory of relativity there have been notable attempts 
made by Eddington, Einstein, Hilbert and Weyl to derive the general partial 
differential equations of physics from a 'Hamiltonian principle.' There arises
now the question: Is the significance of the principle the same here as in 
classical dynamics, namely to insure the complete trigonometric stability of 
equilibrium or periodic motion? 

In concluding these introductory remarks, I wish to express the hope 
that these important and interesting questions will receive the further 
attention which they certainly deserve. In the present paper only a beginning 
is made. 

§ 1. Definition of Stability. 

We propose to consider a system of $n$ ordinary differential equations of 
the first order 

$$dx_i/dt = X_i(x_1, \dots, x_n) \qquad (i = 1, \dots, n), \tag{1}$$

where the right-hand members are real analytic functions of $x_1, \dots, x_n$ 
in the open connected continuum under consideration. In particular we shall 
study the properties of the solutions of (1) in the neighborhood of an 
'equilibrium point' of this continuum, i. e., a point for which all of the 
functions 


$X_i$ vanish. It is especially questions concerning the stability of nearby 
motions that will occupy our attention. 

There are two quite distinct properties of stability of equilibrium: the 
first is that the nearby motions must continue near to equilibrium for a 
certain time; the second is that these nearby motions have a quasi-periodic 
character. We propose to incorporate both of these conditions in our 
definition of stability. 

Definition of Stability. An equilibrium point of the differential system 
(1), taken at the origin in $x_1, \dots, x_n$ space, will be said to be stable 
of the $m$th order if it possesses the two following properties: (A) all 
points lying at a distance $\epsilon$ from the origin at time $t_0$ remain at 
a distance not exceeding $K\epsilon$ for an interval 
$|t - t_0| \le L\epsilon^{-m}$ ($K$, $L$, definite positive constants); (B) 
for any fixed interval $|t - t_0| \le T$ the coordinates $x_1, \dots, x_n$ of 
any such motion and polynomials with lowest terms of degree $s > 0$ in the 
coordinates can be represented by trigonometric sums 

$$A_0 + \sum_{j=1}^{N} (A_j \cos p_j t + B_j \sin p_j t) \qquad (|p_i - p_j| \ge P > 0)$$

of not more than $N + 1$ terms with an error not exceeding 
$Q\epsilon^{m+s}$ ($P$, $N$, $Q$, definite positive constants) during the 
fixed interval. 

If the equilibrium point is stable of all orders it will be said to be 
completely stable. 

It is obvious that stability of any order necessitates stability of all lower 
orders. 

The equations (1) are invariant in form under any point transformation 

$$\bar{x}_i = \varphi_i(x_1, \dots, x_n) \qquad (i = 1, \dots, n)$$

where the functions $\varphi_i$ are analytic with 
$|\partial\varphi_i/\partial x_j| \ne 0$ at the transformed equilibrium 
point. The order of stability is clearly invariant under such a change of 
variables. In employing these transformations we shall always take the 
transformed point to be at the origin also, so that $\bar{x}_i$ are given by 
power series in $x_1, \dots, x_n$ without constant terms. 


It will be one of our principal purposes to show the relation between these 
properties (A) and (B), and we shall find that 'perturbative stability' (A) 
does not necessitate 'trigonometric stability' (B), though it turns out that 
the requirement of trigonometric stability suffices to insure the stability of 
perturbations. We find it convenient first to go as far as possible on the 
basis of property (A) alone, and then employ the more fundamental 
requirement (B). 



It is extremely instructive to consider a few characteristic examples in 
this connection. 

The simplest possible example of a point of equilibrium is furnished by 
the equation 

$$dx/dt = ax,$$

in which the point lies at $x = 0$. Here the exponential form of solution 
shows that there cannot be first order stability even in the sense (A). 

A more instructive example is furnished by the equations 

$$dx/dt = -y + \tfrac{1}{2}x(x^2 + y^2), \qquad dy/dt = x + \tfrac{1}{2}y(x^2 + y^2)$$

with equilibrium point at (0, 0). The general solution is easily determined 
to be 

$$x = \frac{r_0 \cos(\theta_0 + t - t_0)}{\sqrt{1 - r_0^2 (t - t_0)}}, \qquad y = \frac{r_0 \sin(\theta_0 + t - t_0)}{\sqrt{1 - r_0^2 (t - t_0)}},$$

where $r_0$, $\theta_0$ are the polar coordinates of $(x, y)$ at $t_0$. Thus 
$r_0$ is the distance $\epsilon$ which enters into the above definition of 
stability. 

In this case the expressions for $x$ and $y$ remain of the order of 
$\epsilon$ for $|t - t_0| \le L\epsilon^{-2}$ ($0 < L < 1$), so that the 
equilibrium point is stable of the second order in the sense (A). But since 
the denominator in the expressions for $x$ and $y$ vanishes for 
$t - t_0 = \epsilon^{-2}$ it is obvious that the point is not stable of the 
third order. The approximate trigonometric form of the solutions with an 
error of order 3 in $\epsilon$ for any fixed interval $|t - t_0| \le T$ shows 
that the situation is the same for stability in the sense (B). 
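
The time scale $\epsilon^{-2}$ of this example can be checked numerically. 
The following is only an illustrative sketch, not part of the original 
argument. It assumes the system reads 
$dx/dt = -y + \tfrac{1}{2}x(x^2 + y^2)$, $dy/dt = x + \tfrac{1}{2}y(x^2 + y^2)$ 
(the reading consistent with the stated solution), integrates it with a 
hand-written Runge-Kutta step, and compares the result with the closed form. 
The helper names `rhs`, `closed_form` and `rk4` are hypothetical. 

```python
import math

def rhs(x, y):
    """Right-hand side of the assumed system (rotation plus cubic radial push)."""
    r2 = x * x + y * y
    return (-y + 0.5 * x * r2, x + 0.5 * y * r2)

def closed_form(t, r0, theta0, t0=0.0):
    """Closed-form solution; valid only while r0**2 * (t - t0) < 1."""
    d = math.sqrt(1.0 - r0 * r0 * (t - t0))
    return (r0 * math.cos(theta0 + t - t0) / d,
            r0 * math.sin(theta0 + t - t0) / d)

def rk4(x, y, t_end, steps):
    """Classical fourth-order Runge-Kutta from t = 0 to t = t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(x, y)
        k2 = rhs(x + 0.5 * h * k1[0], y + 0.5 * h * k1[1])
        k3 = rhs(x + 0.5 * h * k2[0], y + 0.5 * h * k2[1])
        k4 = rhs(x + h * k3[0], y + h * k3[1])
        x += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        y += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return x, y

if __name__ == "__main__":
    eps, theta0, L = 0.1, 0.3, 0.5
    t_end = L * eps ** -2          # an interval of length L * eps**-2
    x0, y0 = closed_form(0.0, eps, theta0)
    x, y = rk4(x0, y0, t_end, 20000)
    print("numerical radius / eps:", math.hypot(x, y) / eps)
    print("closed-form radius / eps:", 1.0 / math.sqrt(1.0 - L))
```

With $L < 1$ the radius stays within the factor $1/\sqrt{1 - L}$ of 
$\epsilon$ over the whole interval, which is the content of property (A) for 
$m = 2$; as $L$ approaches 1 the factor grows without bound. 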

On the other hand, for the equations 

$$dx/dt = y + y(x^2 + y^2), \qquad dy/dt = -x - x(x^2 + y^2),$$

the general solution is readily found to be 

$$\begin{cases} x = x_0 \cos a(t - t_0) + y_0 \sin a(t - t_0) & (a = 1 + x_0^2 + y_0^2), \\ y = -x_0 \sin a(t - t_0) + y_0 \cos a(t - t_0). \end{cases}$$

Hence the origin is completely stable in this case in the full sense (A), (B). 

All of the preceding examples seem to indicate that trigonometric 
stability is invariably associated with perturbative stability. The following 
example is of the simplest type which illustrates that this is not the case. 



We shall take $n = 4$ with two pairs of variables $x_1, y_1, x_2, y_2$, and 
the equations 

$$\begin{cases} dx_1/dt = -p_1 y_1 + \alpha x_1(x_2^2 + y_2^2), & dy_1/dt = p_1 x_1 + \alpha y_1(x_2^2 + y_2^2), \\ dx_2/dt = -p_2 y_2 + \beta x_2(x_1^2 + y_1^2), & dy_2/dt = p_2 x_2 + \beta y_2(x_1^2 + y_1^2), \end{cases}$$

with $\alpha$, $\beta$ of opposite signs. For convenience in dealing with the 
equations we shall introduce conjugate variables 

$$u_i = x_i + y_i\sqrt{-1}, \qquad v_i = x_i - y_i\sqrt{-1} \qquad (i = 1, 2),$$

and write the equations in the form 

$$\begin{cases} du_1/dt = (p_1\sqrt{-1} + \alpha u_2 v_2)\,u_1, & dv_1/dt = (-p_1\sqrt{-1} + \alpha u_2 v_2)\,v_1, \\ du_2/dt = (p_2\sqrt{-1} + \beta u_1 v_1)\,u_2, & dv_2/dt = (-p_2\sqrt{-1} + \beta u_1 v_1)\,v_2, \end{cases}$$

where $\alpha$, $\beta$ are real constants with $\alpha\beta < 0$. It is to be 
noted that the equations themselves are of conjugate type: that is, when, for 
instance $u_1$ and $v_1$ are interchanged throughout and $\sqrt{-1}$ changed 
to $-\sqrt{-1}$ in the first pair of equations, the equations are unaltered 
as a set. 

Now multiply the first and second equations of the set just written by 
$v_1$, $u_1$ respectively and add; operate similarly with the second set. 
There is obtained the pair of equations 

$$dw_1/dt = 2\alpha w_1 w_2, \qquad dw_2/dt = 2\beta w_1 w_2,$$

where $w_1 = u_1 v_1$ and $w_2 = u_2 v_2$ are positive or zero for all real 
solutions $x_1$, $y_1$, $x_2$, $y_2$ of course. 

If $w_1$, for instance, vanishes for any $t$, these equations show that it 
will vanish identically, with $w_2$ a constant.* This gives $u_1 = v_1 = 0$, 
and the differential equations in $u_2$, $v_2$ enable us to write the 
corresponding solution 

$$u_1 = 0, \quad v_1 = 0, \quad u_2 = u_2^{(0)} e^{p_2\sqrt{-1}\,(t - t_0)}, \quad v_2 = v_2^{(0)} e^{-p_2\sqrt{-1}\,(t - t_0)}.$$

Consequently, for this type of solution the squared distance from the origin, 
measured by $w_1 + w_2$, is constant, and the representation is trigonometric. 

Let us consider the general solution at distance $\epsilon$ from the origin 
with 

$$w_1^{(0)} + w_2^{(0)} = \epsilon^2, \qquad w_1^{(0)} > 0, \qquad w_2^{(0)} > 0,$$




thus laying aside the possibility just considered. Introduce a parameter 
$\tau$ for which $d\tau = 2w_1 w_2\,dt$. The equations for $w_1$, $w_2$ show 
that we have 

$$w_1 = \alpha(\tau - \tau_0) + w_1^{(0)}, \qquad w_2 = \beta(\tau - \tau_0) + w_2^{(0)},$$

so that $w_1$ and $w_2$ vary monotonically and in opposite senses. 

Because of the integral relation 

$$\beta w_1 - \alpha w_2 = \beta w_1^{(0)} - \alpha w_2^{(0)} = \text{const.},$$

in which $\beta$ and $\alpha$ are of opposite sign by hypothesis, the positive 
quantities $w_1$, $w_2$, and consequently $\tau$ also, vary in a finite 
interval as $t$ passes from $-\infty$ to $+\infty$. 

Let us observe the relation between $t$ and $\tau$ further. Taking 
$t_0 = \tau_0 = 0$ for simplicity of notation, we find 

$$2t = \int_0^{\tau} \frac{d\tau}{(\alpha\tau + w_1^{(0)})(\beta\tau + w_2^{(0)})}.$$

Here the lower limit of integration lies between the two zeros of the 
denominator of the integrand. Thus $\tau$ ranges between these values 
$\tau'$ and $\tau''$ only. 

The integral relation above demonstrates that $w_1$ and $w_2$ remain less 
than a fixed multiple of $\epsilon^2$ for all time. Hence we have stability 
of every order $m$, in the sense (A). 

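On the other hand we have from the original equations 

$$\begin{cases} du_1/dt = (p_1\sqrt{-1} + \alpha w_2)\,u_1, & dv_1/dt = (-p_1\sqrt{-1} + \alpha w_2)\,v_1, \\ du_2/dt = (p_2\sqrt{-1} + \beta w_1)\,u_2, & dv_2/dt = (-p_2\sqrt{-1} + \beta w_1)\,v_2, \end{cases}$$

whence we obtain by integration 

$$\begin{cases} u_1 = u_1^{(0)} e^{p_1\sqrt{-1}\,(t - t_0) + \alpha\int_{t_0}^{t} w_2\,dt}, & v_1 = v_1^{(0)} e^{-p_1\sqrt{-1}\,(t - t_0) + \alpha\int_{t_0}^{t} w_2\,dt}, \\ u_2 = u_2^{(0)} e^{p_2\sqrt{-1}\,(t - t_0) + \beta\int_{t_0}^{t} w_1\,dt}, & v_2 = v_2^{(0)} e^{-p_2\sqrt{-1}\,(t - t_0) + \beta\int_{t_0}^{t} w_1\,dt}. \end{cases}$$

Consequently the coordinates $u_1$, $v_1$, $u_2$, $v_2$ have amplitudes which 
vary monotonically, but they are essentially trigonometric in the angle 
variables. 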

It is obvious (se<> § 3) that x lf y u x 2 , y 2 and u lf v„ u 2 , v 2 cannot be repre¬ 
sentable by trigonometric sums as demanded with an error o| the seooatf' 
order in e. Consequently property (B) fails to hold for m — -li 

Inasmuch as we are not following out very systematically the conditions 


for stability of type (A), it may be remarked at this juncture that in general 
the property (A) will follow if there is an integral in which the initial terms 
give a definite quadratic form, just as they did in the above example. One is 
naturally led to conjecture that, in general, condition (A) can only obtain for 
some order $m$ if there exists a polynomial $P$ in the variables 
$x_1, \dots, x_n$ with initial terms yielding a homogeneous form of degree $k$ 
of definite sign, such that the series on the right in the equation 

$$dP/dt = \sum_{j=1}^{n} \frac{\partial P}{\partial x_j} X_j$$

will begin with terms of degree $m + k$ at lowest. At any rate the type of 
argument employed in the following paragraph shows readily that this 
condition is sufficient for (A).* 

Another example illustrating the^possibility that (A) may be true for^ 
m^ \ without (B) holding for m —* is furnished by the equations 

$$dx/dt = xy^2, \qquad dy/dt = -x^2 y.$$

Here $x^2 + y^2 = \text{const.}$ is an integral, so that (A) holds for any 
order $m$. 

The general solution is 

$$x = \pm\frac{c}{(1 + e^{-2c^2(t - t_0)})^{1/2}}, \qquad y = \pm\frac{c}{(1 + e^{2c^2(t - t_0)})^{1/2}},$$

which is not trigonometric in type. However, here the origin is not an 
isolated point of equilibrium, since all of the points of $x = 0$ or $y = 0$ 
are of this type. On account of its degenerate character the example is not 
as illuminating as the preceding one. 


§ 2. First Order Stability. 

The condition (A) for stability has reference to solutions of the given 
differential system (1) lying for $t = t_0$ at a distance $\epsilon$ (less 
than some definite quantity) from the equilibrium point, which will always be 
taken to be at the origin. Now suppose that we write $x_i = \epsilon y_i$, 
thus making a linear 


* In this connection the reader may be referred to Picard, Traité d'Analyse, 
Vol. 3, chaps. 8, 9, where the fundamental papers of Poincaré, Liapounoff and 
Levi-Civita are also cited. 



transformation of the dependent variables. The equations (1) then take the 
form 

$$dy_i/dt = \frac{1}{\epsilon} X_i(\epsilon y_1, \dots, \epsilon y_n) = Y_i(y_1, \dots, y_n, \epsilon) \qquad (i = 1, \dots, n).$$

Since the $X_i$'s vanish at the origin in $x_1, \dots, x_n$ space, $Y_i$ will 
be given as power series in $y_1, \dots, y_n$, $\epsilon$ without constant 
term. Moreover the domain of uniform convergence of these new series will 
certainly extend beyond a distance unity from the origin in 
$y_1, \dots, y_n$ space if $\epsilon$ is sufficiently small. 

Corresponding to any solution of (1) at distance $\epsilon$ from the origin 
in the $x_1, \dots, x_n$ space, we have a unique solution $y_i$ of the 
transformed equations at distance unity from the origin in 
$y_1, \dots, y_n$ space. Let us write $y_i = a_i$ at $t = t_0$ and regard 
these initial values $a_i$ as fixed, while $\epsilon$ is a parameter which 
approaches 0. There will be defined a set of solutions $y_i$ varying 
analytically with $\epsilon$, according to the fundamental existence theorem 
for ordinary differential equations containing a parameter. For 
$\epsilon = 0$ the set of functions will satisfy the corresponding equations 
of variation 

$$\frac{dy_i}{dt} = \sum_{j=1}^{n} \frac{\partial X_i}{\partial x_j}\, y_j, \tag{2}$$

obtained by setting $\epsilon = 0$ in the differential equations for $y_i$. 
These are $n$ ordinary linear differential equations with constant 
coefficients. 

Suppose now that we make the hypothesis that the equilibrium point of 
(1) under consideration is stable in the sense of (A) for $m = 1$. It follows 
that $y_i = x_i/\epsilon$ is uniformly bounded for $|t - t_0|$ of order 
$\epsilon^{-1}$. Consequently, by allowing $\epsilon$ to approach 0, we infer 
that the limiting functions $y_i$ are bounded for all $t$. Thus for stability 
of type (A), $m = 1$, it is proved that the equations of variation (2) have 
all such solutions $y_i$ bounded for all $t$. But the most general solution 
of (2) is derived from solutions of this type by multiplication by a constant 
factor, since the only restriction made was that the sum of the squares of 
$a_1, \dots, a_n$ is 1. Hence it is a necessary condition for stability of 
type (A), $m = 1$, that every solution of the equations of variation (2) be 
bounded for all $t$. 

The known nature of the solutions of (2) enables us immediately to 
determine the precise significance of this condition. 

Suppose first that the roots $\alpha$ of the characteristic equation 

$$\left|\frac{\partial X_i}{\partial x_j} - \alpha\,\delta_{ij}\right| = 0 \qquad (\delta_{ij} = 0,\ i \ne j;\ \delta_{ii} = 1)$$



are distinct from one another. Then there are $n$ particular solutions of the 
equations of variation which may be written in the form 

$$y_1 = c_{1j} e^{\alpha_j t}, \quad \dots, \quad y_n = c_{nj} e^{\alpha_j t} \qquad (j = 1, \dots, n).$$

The most general solution is obtained by linear combination of these 
particular solutions. Here $\alpha_1, \dots, \alpha_n$ may be real or complex 
of course, but corresponding real solutions may be obtained by combination 
from the pair of solutions corresponding to a complex root and its conjugate. 
It is apparent that such solutions are bounded only in case the expressions 
$e^{\alpha_j t}$ are bounded. Hence all of the $\alpha_j$'s must be pure 
imaginary quantities or zero. In this way we see that in the general case of 
distinct roots of the characteristic equation, it is necessary that the roots 
occur in pure imaginary pairs except for possible zero roots. 

On the other hand in the more complicated case where the roots of the 
characteristic equation are not distinct, there may arise degenerate solutions 
not of the above type, but of the form 

$$y_1 = c_{1j} t^{l_j} e^{\alpha_j t}, \quad \dots, \quad y_n = c_{nj} t^{l_j} e^{\alpha_j t} \qquad (l_j > 0),$$

corresponding to the case in which the elementary divisors corresponding to a 
multiple root $\alpha_j$ are not simple. Evidently these are not bounded. In 
this way we are led to the following first conclusion: * 

For first order stability of type (A) at an equilibrium point it is necessary 
that all the roots of the characteristic equation belonging to the point be 
grouped in pure imaginary pairs except for possible zero roots, and that the 
elementary divisors of the corresponding matrix be distinct. 

We are led at once to ask whether or not this necessary condition for (A) 
is sufficient. It is easily proved to be sufficient as follows. 

Let us make the linear transformation of (1) (in general complex) which 
reduces ... conjugate which correspond to conjugate roots $\alpha_i$, 
$\alpha_j$ of the characteristic equation. We shall take ... 

* The main facts here are of course well known. 


... variables the squared 'distance' from the equilibrium point is properly 
measured by the sum of the squares of the quantities $|x_i|$. 

The transformed equations take the form in the new variables 

$$dx_i/dt = \alpha_i x_i + X_i \qquad (i = 1, \dots, n), \tag{$3_1$}$$

where the power series $X_i$ in the variables $x_1, \dots, x_n$ begin with 
terms of the second degree, while the $\alpha_i$'s are either pure 
imaginaries or zero. The series $X_i$, $X_j$ corresponding to conjugate 
variables $x_i$, $x_j$ will be conjugate of course, while the series $X_i$ 
corresponding to zero roots of the characteristic equation will be real. If 
we make the further change of dependent variables 

$$x_i = e^{\alpha_i t} z_i \qquad (i = 1, \dots, n),$$

we find the equations for $z_i$ to be 

$$dz_i/dt = e^{-\alpha_i t} X_i(e^{\alpha_1 t} z_1, \dots, e^{\alpha_n t} z_n).$$

Since the exponential factors occurring here have absolute value 1, we have 
clearly for some constant $X$ 

$$|X_i| \le X(|z_1|^2 + \dots + |z_n|^2) = Xu^2,$$

so long as $x_1, \dots, x_n$ or $z_1, \dots, z_n$ are sufficiently small in 
absolute value. Hence we have 

$$\left|u\,\frac{du}{dt}\right| \le (|z_1| + \dots + |z_n|)\,Xu^2 \le \sqrt{n}\,Xu^3;$$

the elementary inequality for real quantities $b_1, \dots, b_n$, 

$$(b_1 + \dots + b_n)^2 \le \sum_{i,j=1}^{n} |b_i b_j| \le \tfrac{1}{2}\sum_{i,j=1}^{n} (b_i^2 + b_j^2) = n\sum_{j=1}^{n} b_j^2,$$

is employed in deducing this result, as well as the obvious relation 

$$\left|\frac{d|z_i|}{dt}\right| \le \left|\frac{dz_i}{dt}\right| \le Xu^2.$$

Since $u \ge 0$ we infer 

$$-\sqrt{n}\,X\,dt \le du/u^2 \le +\sqrt{n}\,X\,dt,$$

whence by integration 

$$-\sqrt{n}\,X(t - t_0) \le 1/u_0 - 1/u \le \sqrt{n}\,X(t - t_0) \qquad (t > t_0)$$

with a like inequality for $t < t_0$. 


But at $t = t_0$ we have $u = \epsilon$ of course. It appears then that $u$ 
will not increase to $2\epsilon$ or diminish to $\tfrac{1}{2}\epsilon$ within 
an interval 

$$|t - t_0| \le \frac{\epsilon^{-1}}{2\sqrt{n}\,X}.$$

This is what we desired to prove. 

The stated necessary condition is sufficient for first order stability of 
type (A). 
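
The interval length in this estimate can be checked against the extremal 
case of the differential inequality. The sketch below is illustrative only 
and rests on the assumption that the bound reads 
$|du/dt| \le \sqrt{n}\,Xu^2$ with $u(t_0) = \epsilon$; the function names 
`extremal_u` and `safe_interval` are hypothetical. 

```python
import math

def extremal_u(u0, t, n, X, sign=+1):
    """Solution of du/dt = sign * sqrt(n) * X * u**2 with u(0) = u0.

    sign=+1 is the fastest growth the inequality allows,
    sign=-1 the fastest decay.
    """
    return u0 / (1.0 - sign * math.sqrt(n) * X * u0 * t)

def safe_interval(eps, n, X):
    """Half-length eps**-1 / (2 * sqrt(n) * X) of the interval in the text."""
    return 1.0 / (2.0 * math.sqrt(n) * X * eps)

if __name__ == "__main__":
    n, X = 4, 3.0                      # arbitrary illustrative values
    for eps in (0.1, 0.01, 0.001):
        T = safe_interval(eps, n, X)
        grow = extremal_u(eps, T, n, X, +1)
        shrink = extremal_u(eps, T, n, X, -1)
        print(eps, grow / eps, shrink / eps)
```

In the extremal growing case $u$ reaches exactly $2\epsilon$ at the end of 
the interval, while in the extremal decaying case it has only fallen to 
$\tfrac{2}{3}\epsilon$, so the lower bound $\tfrac{1}{2}\epsilon$ is not 
attained there. 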


A slight extension of the preceding proof leads us to the fact that if this 
condition (A) is satisfied, so that the general solution of the equations of 
variation is composed of trigonometric sums, then (B) will hold for $m = 1$, 
i. e., the coordinates can be suitably approximated to by such sums. 

Within the same $t$ interval we have $|X_i| \le 4Xu_0^2$ so that from the 
differential system in $z_1, \dots, z_n$, the inequalities 

$$|z_i - z_i^{(0)}| \le 4Xu_0^2\,|t - t_0|$$

result by integration, or, in the variables $x_i$, 

$$|x_i - x_i^{(0)} e^{\alpha_i (t - t_0)}| \le 4Xu_0^2\,|t - t_0|.$$

In other words the $n$ trigonometric sums which constitute the solution of 
the equations of variation with the same initial values as $x_i$ at $t_0$ 
represent $x_i$ with an error not exceeding $4X\epsilon^2 |t - t_0|$ during an 
interval $|t - t_0| \le \epsilon^{-1}/2\sqrt{n}\,X$. A like property must then 
hold for the solutions of (1) which are related to the solutions of the 
prepared equations ($3_1$) by a linear transformation with constant 
coefficients. Hence we may state the following result: 

The stated necessary condition for first order stability is sufficient. If 
then we let 

$$c_{1j} e^{\alpha_j t}, \quad \dots, \quad c_{nj} e^{\alpha_j t} \qquad (j = 1, \dots, n)$$

be $n$ linearly independent solutions of the equations of variation (2) so 
that $\alpha_j$ are pure imaginaries or zero and the determinant 
$|c_{ij}| \ne 0$, and if we form the unique linear combination of these 
solutions taking on the same initial values as $x_i$ at $t = t_0$, we obtain 
trigonometric sums of the form 

$$A_0 + \sum_{j=1}^{s} (A_j \cos p_j t + B_j \sin p_j t)$$

where $A_j$, $B_j$ are of the first order in $\epsilon$ and which represent 
$x_i$ with an error not 


exceeding $K\epsilon^2 |t - t_0|$ during an interval 
$|t - t_0| \le L\epsilon^{-1}$ ($K$, $L$, definite positive constants). Here 
$p_1, \dots, p_s$ are the $s$ absolute values of the $2s$ pure imaginary 
quantities $\alpha_i$, so that $2s \le n$. 

§ 3. Supplementary Results for First Order Stability. 

It was stated in § 1 that the condition (B) for stability necessitated the 
condition (A). We propose at first to establish the fact in the case of first 
order stability. In order to do so we prove a lemma concerning trigonometric 
sums of the type referred to in the statement of (B). 

Lemma Concerning Trigonometric Sums. If a sequence of trigonometric sums 

$$A_0 + \sum_{j=1}^{N} (A_j \cos p_j t + B_j \sin p_j t) \qquad (|p_i - p_j| \ge P > 0)$$

where $P$, $N$ are fixed positive constants, approaches a limiting function 
$\phi(t)$ uniformly in the fixed interval $|t - t_0| \le T$ then $\phi$ is 
itself a trigonometric sum of the same kind. 

Proof. The lemma may be proved by induction. Let us first establish 
it in the case $N = 1$, and let us take $t_0 = 0$. Here it is given that 

$$\phi(t) = \lim_{l = \infty} \left[A_0^{(l)} + (A_1^{(l)} \cos p_1^{(l)} t + B_1^{(l)} \sin p_1^{(l)} t)\right]$$

where the limit is approached uniformly by the sequence $\phi_l(t)$. 

Now the double integration of the term of trigonometric type here 
occurring simply divides it by $-(p_1^{(l)})^2$. Hence the sequence of 
functions 

$$\int_0^t\!\!\int_0^t \phi_l(t)\,dt^2 + \phi_l(t)/(p_1^{(l)})^2$$

consists of polynomials of the second degree in $t$. 

Suppose now that $p_1^{(l)}$ has some limit $p_1$ ($|p_1| \ge P$ since 
$p_0^{(l)} = 0$) other than $\infty$ as $l$ becomes infinite. Since the 
uniform limit of a polynomial of any degree can only be a polynomial of that 
degree, it is necessary that 

$$\int_0^t\!\!\int_0^t \phi(t)\,dt^2 + \phi(t)/p_1^2$$

is a polynomial of the second degree in $t$. Consequently, by double 
differentiation, we derive 

$$d^2\phi/dt^2 + p_1^2\,\phi = \text{const.},$$

so that $\phi$ has the form 

$$\phi = A_0 + (A_1 \cos p_1 t + B_1 \sin p_1 t) \qquad (|p_1| \ge P)$$


in agreement with the lemma. 

If, however, $p_1^{(l)}$ has only the limit $\infty$ we conclude immediately 
by a similar method 

$$\phi = A_0,$$

in agreement with the lemma. 

We remark also that if additional polynomials of the second degree are 
added to the trigonometric sums, $N = 1$, a similar proof shows that the limit 
$\phi(t)$ necessarily consists of such a sum augmented by a polynomial of the 
second degree. 

To prove the lemma for $N = 2$ we need only isolate one of the two 
trigonometric terms with period $2\pi/p_2^{(l)}$ in the approximating 
trigonometric sums. Proceeding as above we are led to consider 

$$\int_0^t\!\!\int_0^t \phi_l(t)\,dt^2 + \phi_l(t)/(p_2^{(l)})^2.$$

This expression lacks the trigonometric terms of the type isolated so that if 
$p_2^{(l)}$ has a finite limit $p_2$, 

$$\int_0^t\!\!\int_0^t \phi(t)\,dt^2 + \phi(t)/p_2^2$$

can be approached uniformly by trigonometric sums of the above type of order 
$N = 1$ augmented by polynomials of the second degree, and so is itself 
similarly constituted by the previous remark. By double differentiation, we 
conclude that $\phi$ is a trigonometric sum of the second order and of the 
stated type in this case. When $p_2^{(l)}$ has only the limit $\infty$ the 
method can be carried to a conclusion as before. 

Evidently we have here the basis of a method of proof by means of 
induction for $N = 1, 2, \dots$. Hence the lemma must hold for all values 
of $N$. 

We may prove that the condition (B) implies (A) for $m = 1$ by making 
use of this lemma. By hypothesis $x_i/\epsilon$ can be approximated to by a 
trigonometric sum of the prescribed type in any fixed interval 
$|t - t_0| \le T$ with an error of the first order at most in $\epsilon$. But 
by choosing $x_i$ properly, $x_i/\epsilon$ can be made to approach any 
solution $y_i$ of the equations of variation having 
$a_1^2 + \dots + a_n^2 = 1$. By the lemma then since the functions $y_i$ can 
be approached uniformly by such trigonometric sums they must be such sums 
themselves. This means of course that the general solution of the equation of 
variation is trigonometric, which was seen to be a necessary and sufficient 
condition for stability of the first order. 

It follows that the condition (B) alone in the case $m = 1$ implies 
condition (A), and thus stability of the first order. 



The second result to which I wish to call attention is implicitly contained 
in the results for trigonometric sums specified in § 2 and has to do with certain 
amplitudes. Let us formulate the simple notions involved. 

The amplitudes are defined to be the absolute values of the variables $x_i$ 
under consideration, taken as a solution of the prepared equations ($3_1$). 
If there are $s$ pairs of conjugate imaginary roots $\alpha_i$ and $k$ zero 
roots ($n = 2s + k$) then there are only the amplitudes 

$$C_1, \dots, C_s, C_{s+1}, \dots, C_{s+k}$$

since conjugate imaginary $x_i$'s have the same absolute value. 

It is a fundamental property of the $s + k$ amplitudes that the sum 

$$C_1^2 + \dots + C_{s+k}^2$$

serves to measure essentially the squared distance of a point in the original 
$x_1, \dots, x_n$ space from the position of equilibrium. In fact the linear 
transformation relating (1) and ($3_1$) evidently takes the above sum into a 
positive definite quadratic form in the values of $x_1, \dots, x_n$. Any such 
quadratic form serves to measure the squared distance from the origin as well 
as the sum of the squares used earlier. 

The fact which we desire to establish directly may now be formulated in 
the following way: 

In the case of first order stability the $s + k$ amplitude variables 
$C_1, \dots, C_{s+k}$ change only slightly in comparison with the initial 
total amplitude, 

$$\epsilon = (C_1^2 + \dots + C_{s+k}^2)^{1/2} \qquad (t = t_0)$$

during a time interval $|t - t_0| \le L^*\epsilon^{-1}$ where $L^*$ is to be 
suitably chosen. 

For conjugate $x_i$ and $x_j$ the prepared form ($3_1$) gives 

$$d(x_i x_j)/dt = x_i X_j + x_j X_i,$$

while for real $x_i$ we have 

$$dx_i^2/dt = 2x_i X_i.$$

Hence if we write $u^2$ for the squared amplitude and recall that $X_i$ 
commence with terms of at least the second degree, we find, as in § 2, 

$$|du/dt| \le Xu^2,$$

where $X$ is an appropriately chosen positive constant. Now in § 2 it was 
shown how from such an inequality we may infer that $u$ will vary between 


$\tfrac{1}{2}u_0$ and $2u_0$ in a suitably taken interval 
$|t - t_0| \le L\epsilon^{-1}$. The same argument permits one to infer that 
$u$ will vary between $(1 - \eta)u_0$ and $(1 + \eta)u_0$, $\eta$ arbitrarily 
small, if $L$ is chosen small enough. In other words the sum of the squares 
of the individual amplitudes only varies slightly as compared with the total 
initial amplitude under these conditions. 

But the variables $x_1, \dots, x_n$ of the prepared equations ($3_1$) are not 
uniquely determined, for we may replace $x_1, \dots, x_n$ respectively by 
$c_1 x_1, \dots, c_n x_n$, where $c_1, \dots, c_n$ are arbitrary positive 
constants except that $c_i$ and $c_j$ associated with a conjugate pair $x_i$ 
and $x_j$ are to be equal. In so doing the definition of the amplitudes is 
changed in that these are replaced by arbitrary constant multiples of 
themselves. Thus we conclude that, for any arbitrary set of $s + k$ positive 
constants $\lambda_1, \dots, \lambda_{s+k}$, the sum 

$$\lambda_1 C_1^2 + \dots + \lambda_{s+k} C_{s+k}^2$$

varies only slightly as compared with its initial value for an interval 
of time of order $\epsilon^{-1}$. 

Now choose $s + k$ linearly independent sets 
$\lambda_1, \dots, \lambda_{s+k}$ and an arbitrary small value of 
$\eta > 0$. It will then be possible to choose for each corresponding sum and 
therefore for all, a constant $L^*$ so small that the ratio of the 
corresponding sum of squares to the initial sum $u_0^2$ at $t = t_0$ will 
vary as little as may be desired for $|t - t_0| \le L^*\epsilon^{-1}$. 

The theory of linear algebraic equations shows then at once that all of the 
amplitudes can vary only slightly as compared with the initial total 
amplitude, as first measured. This yields the result stated. 

It will be proved in the following section that when the amplitudes are 
more carefully defined in the case of first order stability, $n$ even, the 
variation in these amplitudes remains slight for an interval of time, 
$|t - t_0| \le L^*\epsilon^{-2}$. 

§ 4. Second Order Stability. 

In dealing with second and higher orders of stability, it is convenient to 
treat separately the cases when the order of the differential system (1) is 
even ($n = 2s$) and odd ($n = 2s + 1$). Furthermore we have now to take 
account of commensurability relations between the characteristic frequencies 
$p_i/2\pi$. 

For the case of even order we confine attention to the case where the 
inequality 

$$l_1 p_1 + \dots + l_s p_s \ne 0$$


holds for sets of integers $l_1, \dots, l_s$ not all zero. This inequality 
excludes in particular any zero or equal roots of the characteristic equation. 
For the purpose of the discussion of second order stability here made, only 
these special restrictions are necessary. 

In the case of odd order there is always one zero root which we take 
to belong to the coordinate $x_n$; we designate the $2s$ other pure imaginary 
or zero roots as in the even order case, and again make the assumption stated 
above, although only the case of further zero roots or of equal roots needs to 
be excluded as far as the argument of this section is concerned. 

It is with the intention of avoiding undue complexity that restriction to 
the 'general case' is thus made. 

We take up first the case $n = 2s$, and suppose that pairs of conjugate 
variables are called 

$$x_1 = \xi_1, \quad x_2 = \eta_1, \quad \dots, \quad x_{n-1} = \xi_s, \quad x_n = \eta_s.$$

The prepared form of equations ($3_1$) may then be written 

$$d\xi_i/dt = p_i\sqrt{-1}\,\xi_i + \Phi_i, \qquad d\eta_i/dt = -p_i\sqrt{-1}\,\eta_i + \Psi_i \qquad (i = 1, \dots, s), \tag{3}$$

where $\Phi_i$, $\Psi_i$ contain no terms of lower than the second degree. 

Let us attempt to proceed with the preparation of these second degree 
terms by means of a further transformation of the form 

$$\xi_i = \bar{\xi}_i + F_i, \qquad \eta_i = \bar{\eta}_i + G_i \qquad (i = 1, \dots, s),$$

where both of the functions $F_i$, $G_i$ are conjugate homogeneous polynomials 
of the second degree in $\bar{\xi}_1, \dots, \bar{\eta}_s$ so that if $\xi_i$, 
$\eta_i$ are conjugate imaginaries, $\bar{\xi}_i$, $\bar{\eta}_i$ 
($i = 1, \dots, s$) will be, and conversely. The inverse transformation 
has the form 

$$\bar{\xi}_i = \xi_i - \bar{F}_i + \cdots, \qquad \bar{\eta}_i = \eta_i - \bar{G}_i + \cdots \qquad (i = 1, \dots, s),$$

where $\bar{F}_i$, $\bar{G}_i$ are the same polynomials in $\xi_i$, $\eta_i$ 
as $F_i$, $G_i$ in $\bar{\xi}_i$, $\bar{\eta}_i$, and where higher order terms 
are only indicated. Such a transformation has already been observed not to 
affect the order of stability (§ 1). 

On making this change of variables in the prepared equations (3), we 
obtain the specific modified equations 


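$$\begin{cases} d\xi_i/dt = p_i\sqrt{-1}\,\xi_i + \Phi_i^{(2)} + p_i\sqrt{-1}\,F_i - \displaystyle\sum_{j=1}^{s} p_j\sqrt{-1}\left(\xi_j \frac{\partial F_i}{\partial \xi_j} - \eta_j \frac{\partial F_i}{\partial \eta_j}\right) + \cdots, \\ d\eta_i/dt = -p_i\sqrt{-1}\,\eta_i + \Psi_i^{(2)} - p_i\sqrt{-1}\,G_i - \displaystyle\sum_{j=1}^{s} p_j\sqrt{-1}\left(\xi_j \frac{\partial G_i}{\partial \xi_j} - \eta_j \frac{\partial G_i}{\partial \eta_j}\right) + \cdots, \end{cases} \tag{4}$$

where dashes over the $\xi_i$'s and $\eta_i$'s have been dropped, and where 
$\Phi_i^{(2)}$, $\Psi_i^{(2)}$ are the second degree components of $\Phi_i$, 
$\Psi_i$ respectively. The displayed parts contain the complete second degree 
term in the new equations. Hence this change of variables does not affect the 
character of the prepared form. 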

If we let $c_i$ stand for the complete coefficient of a definite term of the 
second degree 

$$c_i\,\xi_1^{l_1} \cdots \xi_s^{l_s}\,\eta_1^{m_1} \cdots \eta_s^{m_s} \qquad (l_1 + m_1 + \dots + l_s + m_s = 2)$$

in $\Phi_i$, while $q_i$ is the analogous coefficient in $F_i$, then the 
corresponding coefficient in the second degree term of the first equation 
above is 

$$c_i - [p_1(l_1 - m_1) + \dots + (l_i - m_i - 1)p_i + \dots + (l_s - m_s)p_s]\sqrt{-1}\,q_i.$$

It is at this point that the hypothesis concerning the general case enters. 
The coefficient of $q_i$ cannot vanish unless 

$$l_1 = m_1, \quad \dots, \quad l_i = m_i + 1, \quad \dots, \quad l_s = m_s.$$

Now the argument just stated has not made use of the fact that we are 
dealing with the simple case $l_1 + m_1 + \dots + l_s + m_s = 2$ when there 
is evidently no possibility for the coefficient of $q_i$ to vanish. Hence we 
conclude that, without any assumption of second order stability, we may 
further prepare the equations (3) so that no terms $\Phi_i^{(2)}$, 
$\Psi_i^{(2)}$ remain. 
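
The counting argument behind this conclusion can be illustrated by a short 
enumeration. The sketch below is not from the paper; it assumes the divisor 
has the form $\sum_j p_j(l_j - m_j) - p_i$ and uses the 'general case' 
hypothesis in the form that an integer combination of $p_1, \dots, p_s$ 
vanishes only when every integer coefficient is zero. The function names are 
hypothetical. 

```python
from itertools import product

def exponent_vectors(s, degree):
    """Yield all (l, m) with l, m tuples of s non-negative integers
    and sum(l) + sum(m) == degree."""
    for exps in product(range(degree + 1), repeat=2 * s):
        if sum(exps) == degree:
            yield exps[:s], exps[s:]

def divisor_coefficients(l, m, i):
    """Integer coefficients k_j of p_j in sum_j p_j (l_j - m_j) - p_i."""
    return tuple(l[j] - m[j] - (1 if j == i else 0) for j in range(len(l)))

def resonant_terms(s, degree, i):
    """Terms of the i-th equation whose divisor vanishes for every
    choice of rationally independent p_1, ..., p_s."""
    return [(l, m) for l, m in exponent_vectors(s, degree)
            if all(k == 0 for k in divisor_coefficients(l, m, i))]

if __name__ == "__main__":
    s = 2
    print("degree 2:", resonant_terms(s, 2, 0))   # no term survives
    print("degree 3:", resonant_terms(s, 3, 0))   # terms with l = m + e_i
```

A vanishing divisor requires $l_j = m_j$ for $j \ne i$ and $l_i = m_i + 1$, 
so the total degree $1 + 2\sum_j m_j$ is odd. This is why no second degree 
term is resonant, while at degree three terms such as $\xi_i^2 \eta_i$ are. 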

We may now prove the following result: 

In the general case of first order stability, $n = 2s$, there will also be 
second order stability. Let the combined linear and quadratic transformation 
from $x_i$ to $\xi_i$, $\eta_i$ be made which takes (1) into the prepared 
form (3) lacking second degree terms in $\Phi_i$, $\Psi_i$. If we substitute 

$$\xi_i^{(0)} e^{p_i\sqrt{-1}\,(t - t_0)}, \qquad \eta_i^{(0)} e^{-p_i\sqrt{-1}\,(t - t_0)}$$

for $\xi_i$, $\eta_i$ in the formulas, the trigonometric sums obtained will 
represent 


$x_i$ with an error not exceeding $K\epsilon^3 |t - t_0|$ during an interval 
of time $|t - t_0| \le L\epsilon^{-2}$. 

The details of the proof are very much like those for the analogous result 
concerning first order stability (§ 2). To begin with, we note that in the 
prepared form the non-linear terms on the right are of at least the third 
degree. 

Now, using exactly the argument of § 2 except that we can assume 
$|X_i| \le Xu^3$, we are able to establish the relation stated. 

Suppose now that we define the amplitudes $C_1, \dots, C_s$ more precisely 
by the aid of this especially prepared equation (3). Following a natural 
modification of the argument made in the preceding section we are at once 
led to infer the following additional result: 

In the case of first, and thus second, order stability ($n = 2s$) the $s$ 
individual amplitudes $C_1, \dots, C_s$ may be defined by means of prepared 
equations (3) in which $\Phi_i$, $\Psi_i$ contain no second degree terms. With 
this definition, these amplitudes change only slightly in comparison with the 
total amplitude during a time interval $|t - t_0| \le L^*\epsilon^{-2}$. 

We turn next to the general case of odd order n — 2s -f- 1 for the system 
of equations (1), and propose to briefly indicate some analogous results. 

In this case, besides the $2s$ equations like (3), we have a last equation 

$$dx_n/dt = X$$

where $X$ begins with second degree terms or higher. We attempt to prepare 
further these equations by a transformation 

$$\xi_i = \bar{\xi}_i + F_i, \qquad \eta_i = \bar{\eta}_i + G_i, \qquad x_n = \bar{x}_n + H \qquad (i = 1, \dots, s),$$

where $F_i$, $G_i$, $H$ are homogeneous polynomials of the second degree. We 
find precisely as before that we can make $\Phi_i^{(2)} = \Psi_i^{(2)} = 0$ by 
a proper choice of $F_i$, $G_i$, while the second degree terms in $X$ have the 
form 

$$X^{(2)} - \sum_{j=1}^{s} p_j\sqrt{-1}\left(\xi_j \frac{\partial H}{\partial \xi_j} - \eta_j \frac{\partial H}{\partial \eta_j}\right).$$

If we take typical corresponding terms of the second degree in $X^{(2)}$ and 
$H$, both with exponents $l_1, \dots, l_s$, $m_1, \dots, m_s$, $r$ in 
$\xi_1, \dots, \xi_s$, $\eta_1, \dots, \eta_s$, $x_n$ respectively, and with 
coefficients $c$ and $q$, the new analogous coefficient is 

$$c - [p_1(l_1 - m_1) + \dots + p_s(l_s - m_s)]\sqrt{-1}\,q.$$

312 



Birkhoff: Stability and the Equations of Dynamics. 19 

In the case at hand the only possibility that the coefficient of q vanishes is 
that some U — m< = 1 while the other lj, mj must then vanish. Thus the 
terms €irji and x n 2 in X <2> cannot be removed by this method. 

In consequence of this failure in removing certain second degree terms
it seems probable in advance that for the case $n = 2s + 1$, first order
stability does not imply that of the second order.
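
The counting behind this remark can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of the original argument: the function name `resonant_monomials` and the `offset` convention are hypothetical, and the "general case" is modelled by assuming that $\sum_j p_j k_j = 0$ with integers $k_j$ forces every $k_j = 0$. For the last equation the divisor is $\sum_j p_j(l_j - m_j)$ (offset zero); for the equation in $\xi_i$ it is assumed, as in the earlier sections, to be $\sum_j p_j(l_j - m_j) - p_i$ (offset the i-th unit vector).

```python
from itertools import product

def resonant_monomials(s, degree, offset, with_xn=False):
    """Exponent tuples (l, m, r) of monomials xi^l * eta^m * x_n^r of the
    given total degree that cannot be removed.

    Assumption (the "general case"): p_1..p_s satisfy no vanishing integral
    relation, so sum_j p_j*(l_j - m_j - offset_j) == 0 only when every
    l_j - m_j equals offset_j.
    """
    found = []
    nvars = 2 * s + (1 if with_xn else 0)
    for exps in product(range(degree + 1), repeat=nvars):
        if sum(exps) != degree:
            continue
        l, m = exps[:s], exps[s:2 * s]
        r = exps[2 * s] if with_xn else 0
        if all(lj - mj == oj for lj, mj, oj in zip(l, m, offset)):
            found.append((l, m, r))
    return found

# Last equation dx_n/dt = X for n = 2s + 1 with s = 2 (offset zero).
odd_case = resonant_monomials(2, 2, (0, 0), with_xn=True)
# First xi-equation for n = 2s with s = 2 (offset = first unit vector).
even_deg2 = resonant_monomials(2, 2, (1, 0))
even_deg3 = resonant_monomials(2, 3, (1, 0))
```

With these inputs `odd_case` lists exactly the monomials $\xi_1\eta_1$, $\xi_2\eta_2$ and $x_n^2$, `even_deg2` is empty, and `even_deg3` lists $\xi_1\cdot\xi_1\eta_1$ and $\xi_1\cdot\xi_2\eta_2$, in agreement with the statements of the text for the cases it treats.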

A simple example can be devised to show the truth of such a surmise.
Take n = 3 and write the equations

$$d\xi_1/dt = (p_1\sqrt{-1} + c x_3^2)\,\xi_1, \qquad d\eta_1/dt = (-p_1\sqrt{-1} + c x_3^2)\,\eta_1,$$

$$dx_3/dt = e\,\xi_1\eta_1 \qquad (ce > 0),$$

containing one such non-removable term in the last equation.

On writing $u = \xi_1\eta_1$, $v = x_3$, so that $u_0 + v_0^2 = \epsilon^2$, we obtain
immediately

$$du/dt = 2cv^2 u, \qquad dv/dt = eu,$$

whence

$$d^2v/dt^2 = 2cv^2\, dv/dt.$$

This is immediately integrable, and yields

$$t - t_0 = \frac{3}{2c}\int_{v_0}^{v} \frac{dv}{v^3 - v_0^3 + 3eu_0/2c}.$$

Suppose now that we take $v_0 = 0$ and substitute

$$v = \left(\frac{3eu_0}{2c}\right)^{1/3} w.$$

We find

$$t - t_0 = \left(\frac{3}{2c}\right)^{1/3} (eu_0)^{-2/3} \int_0^{w} \frac{dw}{w^3 + 1}.$$

Taking the upper limit $w = +\infty$ we conclude that the corresponding
interval of time is of order $\epsilon^{-4/3}$, since the integral converges
and $u_0 = \epsilon^2$. Thus v does not remain finite in an interval of order
$\epsilon^{-4/3}$, and property (A) does not hold for m = 2. Hence in this
special example the presence of non-removable terms is associated with
instability of the second order.
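
A rough numerical check of this scaling can be sketched as follows. The sketch assumes the example is read as $du/dt = 2cv^2u$, $dv/dt = eu$ with $v_0 = 0$ and $u_0 = \epsilon^2$, so that the escape time is $(3/2c)^{1/3}(eu_0)^{-2/3}\int_0^\infty dw/(w^3+1)$; the function names and the sample values of c and e are illustrative only.

```python
import math

def w_integral(n=2000):
    """Simpson's rule for the integral of dw/(w^3 + 1) over (0, infinity).

    The tail w > 1 is folded onto (0, 1) by w -> 1/w, which turns the whole
    integral into the integral of dx/(x^2 - x + 1) over (0, 1).
    n must be even.
    """
    h = 1.0 / n
    f = lambda x: 1.0 / (x * x - x + 1.0)
    total = f(0.0) + f(1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(k * h)
    return total * h / 3.0

def escape_time(eps, c=1.0, e=1.0):
    """Time for v to become infinite when v_0 = 0 and u_0 = eps**2."""
    u0 = eps ** 2
    return (3.0 / (2.0 * c)) ** (1.0 / 3.0) * (e * u0) ** (-2.0 / 3.0) * w_integral()

# Halving eps should multiply the escape time by 2**(4/3).
ratio = escape_time(0.05) / escape_time(0.1)
```

The integral has the closed value $2\pi/(3\sqrt{3})$, and `ratio` reproduces the exponent 4/3, which is smaller than the exponent 2 that second order stability would require.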

We proceed now to prove that (B) can only hold if there are no such
non-removable terms in the completely prepared form. Here we use once
more the lemma on trigonometric sums of the preceding section.

Let

$$\xi_i = \epsilon\xi_i^* + \epsilon^2\xi_i^{**} + \cdots, \qquad \eta_i = \epsilon\eta_i^* + \epsilon^2\eta_i^{**} + \cdots,$$

$$x_n = \epsilon x_n^* + \epsilon^2 x_n^{**} + \cdots$$

be that solution of the prepared equations for which

$$\xi_i = a_i^{(0)}\epsilon, \qquad \eta_i = b_i^{(0)}\epsilon, \qquad x_n = c^{(0)}\epsilon \qquad (t = t_0).$$

Here $a_i^{(0)}$ and $b_i^{(0)}$ are conjugate imaginary quantities, and $c^{(0)}$ real, with

$$a_1^{(0)}b_1^{(0)} + \cdots + a_s^{(0)}b_s^{(0)} + c^{(0)2} = 1.$$

Also $\xi_i^*, \eta_i^*, x_n^*$ are solutions of the equations of variation

$$\xi_i^* = a_i^{(0)} e^{p_i\sqrt{-1}\,(t-t_0)}, \qquad \eta_i^* = b_i^{(0)} e^{-p_i\sqrt{-1}\,(t-t_0)}, \qquad x_n^* = c^{(0)}.$$

We consider now the second terms in these series.

By differentiating the prepared equations twice as to $\epsilon$ and setting
$\epsilon = 0$, we find

$$\frac{d\xi_i^{**}}{dt} = p_i\sqrt{-1}\,\xi_i^{**}, \qquad \frac{d\eta_i^{**}}{dt} = -p_i\sqrt{-1}\,\eta_i^{**},$$

$$\frac{dx_n^{**}}{dt} = c_1\xi_1^*\eta_1^* + \cdots + c_s\xi_s^*\eta_s^* + d\,x_n^{*2},$$

where the terms on the right in the last equation arise from the non-removable
terms of X. Furthermore for $t = t_0$, $\xi_i^{**} = \eta_i^{**} = x_n^{**} = 0$,
since, for example, $\xi_i^{**}$ is determined by the second partial derivative of
$\xi_i$ as to $\epsilon$, and $\xi_i$ is constantly equal to $a_i^{(0)}\epsilon$ at $t = t_0$.

Thus we infer that $\xi_i^{**}, \eta_i^{**}$ are identically zero, while from the last
equation

$$x_n^{**} = \left[c_1 a_1^{(0)}b_1^{(0)} + \cdots + c_s a_s^{(0)}b_s^{(0)} + d\,c^{(0)2}\right](t - t_0).$$

But $x_n$ can be approximated by a trigonometric sum of the stated type to the
third order in $\epsilon$, by hypothesis. Thus $(x_n - c^{(0)}\epsilon)/\epsilon^2$ can be
similarly approximated to the first order in $\epsilon$. Consequently the limit of
this quotient, $x_n^{**}$, is such a sum by the lemma. This can only be a
trigonometric sum for all choices of $a_i^{(0)}, b_i^{(0)}, c^{(0)}$ if
$c_i = d = 0$ $(i = 1, \dots, s)$, which we wished to prove.

In the general case $n = 2s + 1$ a necessary and sufficient condition for
second order stability is that a prepared form (3) exists with a supplementary
equation

$$dx_n/dt = X,$$

where $\Phi_i, \Psi_i, X$ begin with terms of at least the third degree. Thence
follow results concerning the representation by trigonometric sums, and
concerning the variation of the s + 1 suitably defined amplitudes, which are
entirely analogous to the results in the case $n = 2s$.

In fact, we have reached a prepared form in the case $n = 2s + 1$ with the
same characteristics as that obtained in the case $n = 2s$ earlier, and can
apply the same arguments with obvious modifications.


§ 5. Third and Higher Order Stability. 

We propose now to assume stability of the third order, taking $n = 2s$
first, and then, by making use of condition (B) only, to obtain a necessary
prepared form of equations, which can readily be proved sufficient for
stability of the third order. An entirely analogous treatment can be made for
$n = 2s + 1$.

By passing thus from the case of second to third order stability as well
as from the first to the second, the general results can be formulated.

Starting with the prepared form (3) we now attempt to remove as many
of the third order terms as possible by writing

$$\xi_i = \bar\xi_i + F_i, \qquad \eta_i = \bar\eta_i + G_i \qquad (i = 1, \dots, s)$$

where now $F_i, G_i$ are homogeneous polynomials of the third degree in
$\bar\xi_1, \dots, \bar\eta_s$. By exactly the same formal work as in our treatment of
second order stability, it turns out that all such terms may be removed except
for terms of the form $U_i\xi_i$, $V_i\eta_i$ in the ith pair of equations, where
$U_i, V_i$ are linear combinations of the s products $\xi_1\eta_1, \dots, \xi_s\eta_s$.
Thus we have

$$d\xi_i/dt = (p_i\sqrt{-1} + U_i)\,\xi_i + \cdots,$$

$$d\eta_i/dt = (-p_i\sqrt{-1} + V_i)\,\eta_i + \cdots$$

where the terms not explicitly written are at least of the fourth degree. From
these equations we derive

$$\frac{d(\xi_i\eta_i)}{dt} = (U_i + V_i)\,\xi_i\eta_i + \cdots \qquad (i = 1, \dots, s),$$

where the omitted terms are at least of the fifth degree.

Now we expand $\xi_i, \eta_i$ in power series in $\epsilon$ as before. To determine
the coefficients in this series we proceed precisely as in the case of second
order stability and find series in $\epsilon$ which we write for $i = 1, \dots, s$,

$$\xi_i = \epsilon\xi_i^* + \epsilon^2\xi_i^{**} + \cdots,$$

$$\eta_i = \epsilon\eta_i^* + \epsilon^2\eta_i^{**} + \cdots.$$

In virtue of the hypothesis of third order stability and the use of our lemma
on trigonometric sums, we conclude that $\phi_i^*$, where

$$\xi_i\eta_i = \epsilon^2 a_i^{(0)}b_i^{(0)} + \epsilon^4\phi_i^* + \cdots,$$

are trigonometric sums.

If now we substitute these different series in the equations above in
$\xi_i\eta_i$, and compare terms in $\epsilon^4$ we find simply

$$d\phi_i^*/dt = \left[U_i(a_1^{(0)}b_1^{(0)}, \dots) + V_i(a_1^{(0)}b_1^{(0)}, \dots)\right] a_i^{(0)}b_i^{(0)}.$$

Consequently unless $U_i + V_i$ vanishes identically for $i = 1, \dots, s$, the
functions $\phi_i^*$ will not be trigonometric sums but will be linear functions
of t, which is not possible.

This gives us the necessary form of prepared equations in the case of
third order stability, $n = 2s$,

$$\frac{d\xi_i}{dt} = M_i\xi_i + \Phi_i, \qquad \frac{d\eta_i}{dt} = -M_i\eta_i + \Psi_i \qquad (i = 1, \dots, s), \tag{5}$$

where $M_i$ has a constant term $p_i\sqrt{-1}$ and linear terms in the s products
$\xi_1\eta_1, \dots, \xi_s\eta_s$, and where $\Phi_i, \Psi_i$ begin with terms of at
least fourth degree.

Our first general conclusion, based on a step-by-step process to higher
and higher orders, is thus the following.

For $n = 2s$ a necessary condition that (B) may hold for any odd order
m of stability is that the equations may be transformed to the form (5) in
which the functions $M_i$ are polynomials in the s products
$\xi_1\eta_1, \dots, \xi_s\eta_s$ of degree $\tfrac12(m - 1)$ at most, with initial
terms $\sqrt{-1}\,p_i$, having pure imaginary coefficients throughout,* and where
$\Phi_i, \Psi_i$ commence with terms of degree at least m + 1. In this case
$\Phi_i, \Psi_i$ may be made to commence with terms of degree m + 2 at lowest
without further assumption.

* These are pure imaginary quantities since the variables are self-conjugate,
and $M_i\xi_i$ must be conjugate to $-M_i\eta_i$.

To show that this form (5) induces (m + 1)st order stability (m odd)
we obtain first of course

$$\left|\frac{d(\xi_i\eta_i)}{dt}\right| \le K u^{m+3},$$

where $u^2 = \xi_1\eta_1 + \cdots + \xi_s\eta_s$, and thence

$$\left|\frac{du}{dt}\right| \le \tfrac12 sK u^{m+2}.$$

Here K is a fixed constant. From this inequality we infer that u remains of
the first order in $\epsilon$ for the interval $|t - t_0| \le L\epsilon^{-(m+1)}$ by the
same method already employed for m = 1.
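
The way an inequality of the type $|du/dt| \le a\,u^{m+2}$ confines u for a time of order $\epsilon^{-(m+1)}$ can be illustrated on the extreme case $du/dt = a\,u^{m+2}$, which integrates in closed form to $u^{-(m+1)} = \epsilon^{-(m+1)} - (m+1)\,a\,t$. The sketch below is only an illustration: the constant `a` stands for $\tfrac12 sK$ and its value, like the function names, is hypothetical.

```python
def u_at(t, eps, m, a=1.0):
    """Solution of du/dt = a*u**(m+2) with u(0) = eps (worst case of the bound)."""
    k = m + 1
    return (eps ** (-k) - k * a * t) ** (-1.0 / k)

def doubling_time(eps, m, a=1.0):
    """Time at which the worst-case solution reaches 2*eps."""
    k = m + 1
    return (1.0 - 2.0 ** (-k)) / (k * a * eps ** k)

m = 3
t_double = doubling_time(0.01, m)
# Halving eps multiplies the guaranteed interval by 2**(m+1).
scale = doubling_time(0.005, m) / doubling_time(0.01, m)
```

Up to the time `t_double`, which grows like $\epsilon^{-(m+1)}$, any u obeying the inequality stays below $2\epsilon$.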

This gives us the property (A) as a consequence of the prepared form. 
Now we deduce also from the same inequality 

$$|\xi_i\eta_i - \xi_i^{(0)}\eta_i^{(0)}| \le K\epsilon^{m+3}\,|t - t_0|$$

in this same interval. On going back to the prepared form this yields at 
once 

| dU/dt — | ^ Ki m * 3 \t —to \+ Nt 

in which $M_i^{(0)}$ designates what $M_i$ becomes when $\xi_i, \eta_i$ are replaced
by $\xi_i^{(0)}, \eta_i^{(0)}$. Hence we infer

| d($ i e~»* <n “- l *')/dt | ^ Kt- 3 | t — 1 0 I -f 

and finally 

| <i — f,«»e*| £ iT<~* (< ~*** * + A r «"" 8 I t — to | 

in an interval of the order of 

Hence we infer in passing that property (B) holds as well, and obtain 
additional properties of the approximate trigonometric representation. 

The necessary condition of the prepared form (5) insures stability in the
complete sense that (A) and (B) hold, and so is sufficient. The explicit
solution of the prepared form with terms $\Phi_i, \Psi_i$ omitted, and with initial
values $\xi_i^{(0)}, \eta_i^{(0)}$ chosen to correspond to the initial values
$x_1^{(0)}, \dots, x_n^{(0)}$ of $x_1, \dots, x_n$, yields trigonometric sums by means
of the composite transformation which represent $x_i$ with an error not exceeding

K.—» + 1 V,—* | t— < 0 | 

during a time interval 

| t — t 0 | ^ 

where $\epsilon$ denotes the initial distance from the origin and where K, L, N are
positive constants.

If we define the amplitudes $C_1, \dots, C_s$ of a solution by means of the
prepared form it is clear that we can establish the following result:

The amplitudes $C_1, \dots, C_s$ vary slightly in comparison with the total
initial amplitude $\epsilon$ during an interval $|t - t_0| \le L^*\epsilon^{-(m+1)}$,
$L^*$ being a suitably chosen positive constant.

There remains now to refer briefly to the case n odd. It is only the
handling of the last equation in $x_n$ that needs to be referred to. In passing
from the second order case to the third it is seen at once that the third order
terms in this equation may be entirely removed without any assumption
concerning third order stability. The third order terms in the other equations
can be treated as in the case n even.

In the case of mth order stability, $n = 2s + 1$, the prepared form is

$$\frac{d\xi_i}{dt} = M_i\xi_i + \Phi_i, \qquad \frac{d\eta_i}{dt} = -M_i\eta_i + \Psi_i \quad (i = 1, \dots, s), \qquad \frac{dx_n}{dt} = X, \tag{6}$$

where $M_i$ are functions of the products $\xi_i\eta_i$ and of $x_n$ of degree m at
most in $\xi_1, \dots, \eta_s, x_n$, and $\Phi_i, \Psi_i, X$ have no terms of degree
lower than m + 1.

From this form, the properties (A) and (B), suitable trigonometric sums, and
the slow variation of the amplitudes follow just as in the case $n = 2s$.


§ 6. Formal Theory of Complete Stability. 

In this section we shall consider equations (1) of even order $n = 2s$,
and for convenience we shall write the equations in the form

$$\frac{d\xi_i}{dt} = \Phi_i(\xi_1, \dots, \eta_s), \qquad \frac{d\eta_i}{dt} = \Psi_i(\xi_1, \dots, \eta_s) \qquad (i = 1, \dots, s) \tag{7}$$

where the conjugate power series $\Phi_i, \Psi_i$ are without constant terms and
where the first degree terms in $\Phi_i$ and $\Psi_i$ are $p_i\sqrt{-1}\,\xi_i$ and
$-p_i\sqrt{-1}\,\eta_i$ respectively. We suppose furthermore that we are
confronted by the general case as defined in § 3.

In other words, of all possible equations (1) we are dealing with those
typical of first order stability.

It is obvious that any formal power series transformation

$$\xi_i = \bar\xi_i + F_i, \qquad \eta_i = \bar\eta_i + G_i \qquad (i = 1, \dots, s), \tag{8}$$

with $F_i$ and $G_i$ conjugate series in $\bar\xi_i, \bar\eta_i$, carries the system into
another formal system of the same type, in which $\Phi_i, \Psi_i$ may not be
convergent series, if $F_i, G_i$ are not. It is apparent that we have here a
formal group leaving (7) invariant, which will be perfectly consistent with
itself in the sense that the result of carrying out two successive
transformations is the same as that derived by the composite transformation.*

Now as already proved we can remove the second degree terms in $\Phi_i, \Psi_i$
by using $F_i, G_i$ which are homogeneous quadratic polynomials. Following
this with a transformation in which $F_i, G_i$ are cubic, we can remove all terms
of the third degree save those in $\Phi_i$ and $\Psi_i$ of the respective types
$U_i\xi_i$ and $V_i\eta_i$ where $U_i, V_i$ are linear in the s variables
$\xi_1\eta_1, \dots, \xi_s\eta_s$. All of this goes just as in our discussion of
stability in the earlier sections, and we arrive at the following first
conclusion by an infinite succession of steps.

Under the formal group the equations (7) may be given the form

$$\frac{d\xi_i}{dt} = U_i\xi_i, \qquad \frac{d\eta_i}{dt} = V_i\eta_i \qquad (i = 1, \dots, s), \tag{9}$$

where $U_i, V_i$ are power series in the s products $\xi_1\eta_1, \dots, \xi_s\eta_s$.

This result is extremely interesting from the formal point of view for the
following reason:

If we write $u_i = \xi_i\eta_i$ then we have the formal equations in $u_1, \dots, u_s$

$$\frac{du_i}{dt} = \left[U_i(u_1, \dots) + V_i(u_1, \dots)\right]u_i \qquad (i = 1, \dots, s), \tag{10}$$

thus yielding an associated formal differential system of one half the order of
the original system (7).


We propose next to ask what is the most general transformation of the
group (8) carrying the normal form (9) into the normal form. Let us
consider terms of the second order in $F_i, G_i$ which alone can modify those of
the second order in $\Phi_i, \Psi_i$. According to the discussion carried out in the
treatment of second order stability, these are uniquely determined by the
initial and final forms of (9). But in neither are there terms of the second
degree. Hence $F_i, G_i$ lack second degree terms.

Pass next to third degree terms in $U_i\xi_i, V_i\eta_i$ which are only affected by
the terms of equal or less degree in $F_i, G_i$. Here we find as in dealing with
third order stability that since $U_i\xi_i$ and $V_i\eta_i$ have only terms of the
third degree which involve a factor $\xi_i$ and $\eta_i$ together with a product
$\xi_j\eta_j$ linearly, $F_i, G_i$ contain only terms of the third degree of the same
character. Thus we may continue indefinitely, and we infer


The most general transformation (8) which carries the normal form into
itself has the form

$$\xi_i = \bar\xi_i F_i^*, \qquad \eta_i = \bar\eta_i G_i^* \qquad (i = 1, \dots, s) \tag{11}$$

where $F_i^*, G_i^*$ are series in the products $\bar\xi_1\bar\eta_1, \dots, \bar\xi_s\bar\eta_s$
with constant terms 1. The form of this transformation indicates that the
family of formal 'surfaces' $Q(u_1, \dots, u_s) = \text{const.}$ is invariant under
the group.

* It is to be observed that this group admits of a slight extension in that
the initial terms $\bar\xi_i, \bar\eta_i$ might be replaced respectively by
$c_i\bar\xi_i, d_i\bar\eta_i$.

The general invariant theory of equations (7) is less simple than that
of the special class of equations possessing complete formal stability.

A necessary and sufficient condition for complete stability of a system of
equations (7) is that $U_i + V_i = 0$ $(i = 1, \dots, s)$ so that the equations take
a normal form

$$\frac{d\xi_i}{dt} = M_i\xi_i, \qquad \frac{d\eta_i}{dt} = -M_i\eta_i \qquad (i = 1, \dots, s) \tag{12}$$

where $M_1, \dots, M_s$ are pure imaginary formal power series in the s products
$\xi_1\eta_1, \dots, \xi_s\eta_s$, with initial terms $p_1\sqrt{-1}, \dots, p_s\sqrt{-1}$
respectively.
tiVb • * •, with initial terms p.V—1, ' * P«V—1 respectively. 

For it is precisely a result of the earlier sections that stability of the
mth order yields this normal form to the terms of order m. Also the new
transformations employed at successive stages (m = 1, 2, 3, ...) do not
affect terms of lesser degree. Hence we obtain a limiting formal
transformation which takes the equations into the given form. Conversely, if
there is such a form, the actual transformation obtained from the formal
transformation by breaking off at the term of (m + 1)st degree will yield the
same form of equations out to terms of the same degree so that the prepared
form is actually obtained, and thus stability of any arbitrary order m is
ensured.

These series $M_i$ must be pure imaginary since the products $\xi_i\eta_i$ involved
are self-conjugate.

For this system (12) we have the integrals $\xi_i\eta_i = \text{const.}$ Hence in the
most general transformation (11) which keeps the normal form (12) unaltered
we have $\bar\xi_i\bar\eta_i = \text{const.}$ also. On this account it is possible to
determine the transformed equations at once. We find

$$\frac{d\bar\xi_i}{dt} = \frac{1}{F_i^*}\frac{d\xi_i}{dt} = \frac{M_i\xi_i}{F_i^*} = M_i\bar\xi_i$$

with like equations in $\bar\eta_i$.

The most general transformation (11) leaves the normal equations (12)
unaffected except that in each $M_i$ the products $\xi_i\eta_i$ are replaced by
$\bar\xi_i\bar\eta_i F_i^* G_i^*$. Thus the subgroup for which $G_i^* = 1/F_i^*$ leaves the
equations unaffected in every respect.

The question naturally presents itself as to what are the invariants of the
equations of completely stable type. The reply is obviously the following:

Reduce the given equations with the property of complete stability to the
normal form (12), thus obtaining s associated series

$$M_1(u_1, \dots, u_s), \ \dots, \ M_s(u_1, \dots, u_s).$$

The invariants of the completely stable type of equations will be the invariants
of these forms under arbitrary transformations

$$u_i = \bar u_i f_i \qquad (i = 1, \dots, s)$$

where the $f_i$ are arbitrary real power series except that they must have a
constant term 1.

In this normal form the equations can be at once integrated precisely
because $\xi_i\eta_i = \xi_i^{(0)}\eta_i^{(0)}$ give s integrals for $i = 1, \dots, s$.

The integrated form of (12) is

$$\xi_i = \xi_i^{(0)} e^{M_i^{(0)}(t - t_0)}, \qquad \eta_i = \eta_i^{(0)} e^{-M_i^{(0)}(t - t_0)} \qquad (i = 1, \dots, s). \tag{13}$$

When these expressions are substituted in the formal series giving
$x_1, \dots, x_n$ in terms of $\xi_1, \dots, \eta_s$ the general formal trigonometric
solution is obtained, involving $\xi_1^{(0)}, \dots, \eta_s^{(0)}$ as the n arbitrary
constants of integration.
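
The integration of the normal form can be checked numerically on a small instance. The sketch below takes s = 1 and a hypothetical truncated series $M = \sqrt{-1}\,(p + \kappa\,\xi\eta)$ (the values of `p` and `kappa` are illustrative assumptions), integrates $d\xi/dt = M\xi$, $d\eta/dt = -M\eta$ by a classical Runge-Kutta step, and compares the result with the exponential solution in which M is frozen at its initial value.

```python
import cmath

def M(w, p=1.0, kappa=0.5):
    """Hypothetical pure imaginary series in the product w = xi*eta."""
    return 1j * (p + kappa * w)

def rhs(xi, eta):
    m = M(xi * eta)
    return m * xi, -m * eta

def rk4(xi, eta, t_end, steps):
    """Classical fourth order Runge-Kutta integration from t = 0 to t_end."""
    h = t_end / steps
    for _ in range(steps):
        k1 = rhs(xi, eta)
        k2 = rhs(xi + 0.5 * h * k1[0], eta + 0.5 * h * k1[1])
        k3 = rhs(xi + 0.5 * h * k2[0], eta + 0.5 * h * k2[1])
        k4 = rhs(xi + h * k3[0], eta + h * k3[1])
        xi += h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        eta += h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return xi, eta

xi0 = 0.3 + 0.4j
eta0 = xi0.conjugate()          # conjugate variables
t = 2.0
xi_num, eta_num = rk4(xi0, eta0, t, 2000)

m0 = M(xi0 * eta0)              # M evaluated at the initial values
xi_exact = xi0 * cmath.exp(m0 * t)
eta_exact = eta0 * cmath.exp(-m0 * t)
```

Within the accuracy of the integrator the product `xi_num * eta_num` stays equal to its initial value and the numerical solution agrees with the exponential form, which is the mechanism the text relies on.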

The products $\xi_i\eta_i$ give real power series in the n variables $x_1, \dots, x_n$
evidently beginning with a quadratic form which is the product of two
conjugate linear factors. This result may be stated as follows:

If the equations (1) have the property of complete stability there exist s
first integrals $\phi_i = c_i$ $(i = 1, \dots, s)$, given by formal series beginning
with a second degree term which is the product of two conjugate linear factors,
forming in all $n = 2s$ linearly independent linear expressions.

Of course an arbitrary $Q(u_1, \dots, u_s) = \text{const.}$ also yields a first
integral, for any real power series Q.

It remains to say a word about the case of odd order $n = 2s + 1$. It is
the normal form that is particularly to be noted. This may be written

$$\frac{d\xi_i}{dt} = M_i\xi_i, \qquad \frac{d\eta_i}{dt} = -M_i\eta_i, \qquad \frac{dx_n}{dt} = 0 \qquad (i = 1, \dots, s). \tag{12'}$$

Here $M_i$ is a power series in the s products $\xi_1\eta_1, \dots, \xi_s\eta_s$ and in
$x_n$ only. Thus the equations separate into a system of the type obtained in
the case of even order together with an equation yielding $x_n = \text{const.}$ for
the remaining variable.

The normal form of equation for the completely stable case, $n = 2s + 1$,
consists of the equations (12'). These equations display the slight difference
between the odd and even order case. In the odd order case there exists a
real formal series integral $\phi = \text{const.}$ (corresponding to
$x_n = \text{const.}$) containing a first degree term, as well as s integrals
$\phi_i = \text{const.}$

§ 7. Variational Principles and the Equations of Dynamics. 

Let $P_1, \dots, P_n, Q$ be n + 1 real analytic functions ($n = 2s$) of
$x_1, \dots, x_n$ in a certain open connected continuum under consideration, and
let $C^*$ be an arc of a curve with continuously turning tangent lying within
this continuum given by

$$x_i = x_i^*(t) \qquad (t_0 \le t \le t_1)$$

for $i = 1, \dots, n$, where $x_i^*$ are continuous functions of t with continuous
derivatives. In this case the integral

$$I = \int_{t_0}^{t_1} (P_1\,dx_1 + \cdots + P_n\,dx_n + Q\,dt) \tag{14}$$

has a certain numerical value $I^*$ along the curve $C^*$. Now suppose that

$$x_i = x_i^*(t, \lambda) \qquad (t_0 \le t \le t_1)$$

for $i = 1, \dots, n$ represents any arbitrary one-parameter family of such curves
which gives $C^*$ for $\lambda = 0$, and has the same end points as $C^*$ for all
values of $\lambda$. Suppose furthermore that the functions $x_i^*(t, \lambda)$ are
restricted to have continuous partial derivatives as to t and $\lambda$. A family

$$x_i = x_i^*(t) + \lambda\,\delta x_i(t) \qquad (i = 1, \dots, n)$$

will fulfill these conditions if the $\delta x_i$ are continuous together with their
derivatives and vanish at $t = t_0$ and $t = t_1$, while $\lambda$ is taken
sufficiently small. Under these circumstances I becomes a function $I(\lambda)$
when $x_i^*(t, \lambda)$ are substituted for $x_i$ under the sign of integration, and
this function $I(\lambda)$ will have a continuous derivative as to $\lambda$ whose
value for $\lambda = 0$ we denote by $\delta I$. This variation $\delta I$ may vanish
identically along a curve $C^*$, i. e., $\delta I^* = 0$, and it turns out by direct
inspection of the expression for $\delta I$ that the condition is precisely that
the coordinates $x_1^*, \dots, x_n^*$ satisfy the set of n Pfaffian differential
equations

$$\sum_{j=1}^{n}\left(\frac{\partial P_i}{\partial x_j} - \frac{\partial P_j}{\partial x_i}\right)\frac{dx_j}{dt} - \frac{\partial Q}{\partial x_i} = 0 \qquad (i = 1, \dots, n). \tag{15}$$



All this amounts merely to the statement of some elementary principles 
in the calculus of variations as applied to a particular and very important 
type of integral. 
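
The "direct inspection" can be written out as a short worked computation. What follows is the standard first-variation calculation for an integral of the form $\int (\sum_i P_i\,dx_i + Q\,dt)$, given as a sketch; the overall sign with which the resulting equations are written is a convention assumed here.

```latex
\begin{aligned}
\delta I
 &= \int_{t_0}^{t_1} \sum_{i=1}^{n}\Bigl[\,\sum_{j=1}^{n}
      \frac{\partial P_j}{\partial x_i}\,\frac{dx_j}{dt}\,\delta x_i
      + P_i\,\frac{d(\delta x_i)}{dt}
      + \frac{\partial Q}{\partial x_i}\,\delta x_i \Bigr]\,dt \\
 &= \Bigl[\,\sum_{i=1}^{n} P_i\,\delta x_i\Bigr]_{t_0}^{t_1}
    + \int_{t_0}^{t_1} \sum_{i=1}^{n}\Bigl[\,\sum_{j=1}^{n}
      \Bigl(\frac{\partial P_j}{\partial x_i}
           - \frac{\partial P_i}{\partial x_j}\Bigr)\frac{dx_j}{dt}
      + \frac{\partial Q}{\partial x_i}\Bigr]\,\delta x_i\,dt .
\end{aligned}
```

The boundary term vanishes because the $\delta x_i$ vanish at $t_0$ and $t_1$, and since the $\delta x_i$ are otherwise arbitrary, $\delta I = 0$ requires each bracket to vanish, that is, $\sum_j (\partial P_i/\partial x_j - \partial P_j/\partial x_i)\,dx_j/dt - \partial Q/\partial x_i = 0$ for $i = 1, \dots, n$.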

On account of the fact that the vanishing of $\delta I$ along a curve in this
sense is obviously independent of the particular coordinate system, it follows
that if we make a change of variables

$$x_i = \phi_i(\bar x_1, \dots, \bar x_n) \qquad (i = 1, \dots, n), \tag{16}$$

where the correspondence between $x_1, \dots, x_n$ and $\bar x_1, \dots, \bar x_n$ space
is one-to-one and analytic along the curve, the new equations can be obtained
by determining $\bar P_1, \dots, \bar P_n, \bar Q$ by substitution under the integral
sign, and then writing down the new corresponding Pfaffian equations. But the
formal relation so established can of course be stated and proved directly,
without intervention of the principle of variation.

Suppose now that we consider formal (i. e. convergent or divergent)
series for $P_i, Q$ in the neighborhood of the origin in $x_1, \dots, x_n$ space, and
also the corresponding formal equations (15), and suppose that we effect the
formal transformation (16) in which $\phi_i$ are given by similar series without
constant terms for which the formal Jacobian has a constant term not zero.
The new equations can be obtained by the same rule of transformation as
before, and in general the result of more than one such transformation is of
course the same as that for the composite formal transformation. These
statements can be proved by first taking the actual equations obtained by
breaking off the series for $P_i, Q$ at terms of high degree m, while actual
transformations are obtained by breaking off the series for $\phi_i$ in the same
way. By considering the relations involved as m becomes infinite, the stated
formal results follow at once.

I state these facts at some length since it is important to realize that in
this way the condition $\delta I = 0$ yields a perfectly definite system of formal
differential equations (15) subject to the same formal rules as in the case
of actual convergence.




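This point of view admits of a slight extension which will come into play
in our first theorem. Suppose that $P_i, Q$ are generalized to the respective
forms $P_i/D$, $Q/D$, where $P_i, Q, D$ are formal power series, but where D need
not contain a constant term. Nevertheless we obtain an extended
Pfaffian system, which results from (15) on multiplying through by $D^2$,

$$\sum_{j=1}^{n}\left[D\left(\frac{\partial P_i}{\partial x_j} - \frac{\partial P_j}{\partial x_i}\right) - P_i\frac{\partial D}{\partial x_j} + P_j\frac{\partial D}{\partial x_i}\right]\frac{dx_j}{dt} - \left(D\frac{\partial Q}{\partial x_i} - Q\frac{\partial D}{\partial x_i}\right) = 0.$$

The equations obtained will be independent of any formal factor $\mu$ introduced
in all of the numerators and denominators of $P_i/D$, $Q/D$. Thus the
formal significance of the condition $\delta I = 0$ is maintained even in this case.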

If the equations (1) are of completely stable type at the origin, and if
$\xi_i, \eta_i, M_i$ $(i = 1, \dots, s)$ are the series in $x_1, \dots, x_n$ which appear
in the normal form (12), then the equations (1) may be expressed in the
extended Pfaffian form

$$\delta\int_{t_0}^{t_1}\Bigl\{\sum_{j=1}^{s} M_j\Bigl(\frac{d\xi_j}{\xi_j} - \frac{d\eta_j}{\eta_j}\Bigr) - \sum_{j=1}^{s} M_j^2\,dt\Bigr\} = 0. \tag{17}$$

According to what has been said above, it is only necessary to verify that
(17) leads to the equations (12) when $\xi_1, \dots, \eta_s$ are taken as the dependent
variables. This may be immediately verified if the conditions $dM_i = 0$
$(i = 1, \dots, s)$ are equivalent to $d(\xi_i\eta_i) = 0$, as we shall take them to be.
A still more formal attack is to change variables in (12), writing $u_i = M_i$,
$v_i = \log(\xi_i/\eta_i)$ $(i = 1, \dots, s)$. The condition (17) becomes

$$\delta\int_{t_0}^{t_1}\Bigl\{\sum_{j=1}^{s} u_j v_j'\,dt - \sum_{j=1}^{s} u_j^2\,dt\Bigr\} = 0, \tag{17'}$$

giving the desired equations

$$du_i/dt = 0, \qquad dv_i/dt = 2u_i \qquad (i = 1, \dots, s)$$

if we grant the legitimacy of the formal processes involved.

The equations given by (17') are of Hamiltonian form with $u_i, v_i$ as
conjugate variables and with principal function $H = \sum_{j=1}^{s} u_j^2$. Or,
better still, we can remain within the non-extended type of Hamiltonian
equations in case $M_i$, aside from its constant term which does not effectively
enter in (17), is divisible by $\xi_i\eta_i$. In this case
$(M_i - p_i\sqrt{-1})/\xi_i\eta_i$ is an admissible power series, and (17) may be
modified to the equivalent form


s f ‘ { 2 v~i - i U,vt } 

•y t 0 j=i >=* 

giving equations of the Hamiltonian form, if we take 

. /.V<—p.V—T \ 5 _ /-'/ i — Pi V~~l\ * 

(I8) fl „v=T > ) 


as conjugate variables and $\tfrac12\sum_{j=1}^{s} M_j^2/\sqrt{-1}$ as principal
function. Also, this choice of variables will lie in the original formal group
in case $M_i$ contains a term $c_i\xi_i\eta_i$, $c_i \ne 0$, of the first degree. Of
course it can contain no other such terms in this case. These results may be
summarized as follows.

The equations (1) of completely stable type can be made to assume the
Hamiltonian form, at least if we admit the legitimacy of certain extended
formal transformations. In any case if $M_i - p_i\sqrt{-1}$ is divisible by
$\xi_i\eta_i$ for $i = 1, \dots, s$, with linear term $c_i\xi_i\eta_i$ $(c_i \ne 0)$, then the
equations (1) can be made to take Hamiltonian form by means of (18), without
any use of such extended transformations.


The last part of this result is obvious from the normal form (12) which
becomes of the very special type

$$\frac{d\xi_i}{dt} = (p_i + \xi_i\eta_i)\sqrt{-1}\,\xi_i, \qquad \frac{d\eta_i}{dt} = (-p_i - \xi_i\eta_i)\sqrt{-1}\,\eta_i \qquad (i = 1, \dots, s)$$

if use be made of the special variables (18).

Consider now a Hamiltonian system (1) of general type with equilibrium
point at the origin, which can be written

$$\frac{du_i}{dt} = \frac{\partial H}{\partial v_i}, \qquad \frac{dv_i}{dt} = -\frac{\partial H}{\partial u_i} \qquad (i = 1, \dots, s)$$

in which the lowest degree terms in H are

$$\sqrt{-1}\,(p_1u_1v_1 + \cdots + p_su_sv_s)$$

where the numbers $p_i$ arise in the same way as for any equilibrium point with
first order stability. The variables $u_i, v_i$ are conjugate imaginary variables
of course, so that a preliminary linear transformation of the real variables is
presupposed.

It is well-known that the transformation for arbitrary K,

$$u_i = \frac{\partial K}{\partial v_i}, \qquad \bar v_i = \frac{\partial K}{\partial \bar u_i} \qquad (i = 1, \dots, s),$$

preserves the Hamiltonian form, i. e. that the transformed equations are
obtained by replacing $u_i, v_i$ by their series in $\bar u_i, \bar v_i$ in H to obtain
$\bar H$. We shall assume that

$$K = \bar u_1v_1 + \cdots + \bar u_sv_s + \cdots$$

where the terms not explicitly written are of higher than the second degree.
If K has only the first term, the transformation becomes the identity

$$u_i = \bar u_i, \qquad v_i = \bar v_i \qquad (i = 1, \dots, s).$$

These transformations form a group.

We propose first to take K of the third degree and thus get rid of the
third degree terms in H; next to take K as the sum of the first quadratic
term and a homogeneous fourth degree term and thus eliminate all the fourth
degree terms of H except those quadratic in the s products $u_1v_1, \dots, u_sv_s$;
and so on. Thus by an indefinite sequence of such transformations we propose
to make H a function of the s products $u_1v_1, \dots, u_sv_s$.

Now the first of these transformations is

$$u_i = \bar u_i + \frac{\partial K^*}{\partial v_i}, \qquad \bar v_i = v_i + \frac{\partial K^*}{\partial \bar u_i} \qquad (i = 1, \dots, s)$$

in which the variables on the right of each equation are $\bar u_i, v_i$, and $K^*$
is a homogeneous cubic in these variables. If we express $u_i, v_i$ in terms of
$\bar u_i, \bar v_i$ they take the form

$$u_i = \bar u_i + \frac{\partial \bar K^*}{\partial \bar v_i} + \cdots, \qquad v_i = \bar v_i - \frac{\partial \bar K^*}{\partial \bar u_i} + \cdots,$$

where $\bar K^*$ is $K^*$ with the variables $\bar u_i, v_i$ replaced by $\bar u_i, \bar v_i$, and
the complete first and second degree terms are explicitly written.

Consequently $\bar H^{(3)}$ as obtained by the usual expansion in Taylor's
formula is

$$\bar H^{(3)} = H^{(3)} - \sum_{j=1}^{s} p_j\sqrt{-1}\left(u_j\frac{\partial K^*}{\partial u_j} - v_j\frac{\partial K^*}{\partial v_j}\right)$$

where $H^{(3)}$ designates the third degree terms in H, and the dashes over the
letters are omitted.

Evidently then the homogeneous cubic polynomial $K^*$ can be so determined,
and uniquely, that all of the third degree terms disappear in $\bar H$. In
fact a term in $K^*$

$$q\,u_1^{l_1}v_1^{m_1}\cdots u_s^{l_s}v_s^{m_s}$$

leads to a similar new term in $\bar H$ with coefficient

$$-q\Bigl[\sum_{j=1}^{s} p_j\sqrt{-1}\,(l_j - m_j)\Bigr],$$

where the term in brackets is not zero for

$$l_1 + m_1 + \cdots + l_s + m_s = 3.$$

This is in accordance with our statement since there can be no third degree
term in the products $u_1v_1, \dots, u_sv_s$.

Another transformation will lead to a modified fourth degree term in H
of the same type,

$$\bar H^{(4)} = H^{(4)} - \sum_{j=1}^{s} p_j\sqrt{-1}\left(u_j\frac{\partial K^*}{\partial u_j} - v_j\frac{\partial K^*}{\partial v_j}\right),$$

where $K^*$ is now the arbitrary homogeneous polynomial of the fourth degree
at our disposal. By the aid of it we can eliminate all the fourth degree
terms in $H^{(4)}$ except those for which $l_i = m_i$, i. e. the very terms we do not
attempt to remove.
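
One step of this elimination can be sketched as a small routine acting on coefficient tables. This is an illustration only: the dictionary layout, the function name `normalization_step`, and the sample values of the $p_j$ are assumptions. A term $c\,u^l v^m$ of H is cancelled by the term $q\,u^l v^m$ of $K^*$ with $q = c/[\sum_j p_j\sqrt{-1}(l_j - m_j)]$ whenever the bracket is not zero, and is kept otherwise.

```python
def normalization_step(h_terms, p, tol=1e-12):
    """Split one homogeneous degree of H into removable and kept terms.

    h_terms maps (l, m) -> c, the coefficient of u^l * v^m, where l and m
    are tuples of exponents. Returns (k_terms, kept): k_terms holds the
    coefficients q of K*, chosen so that c - q*bracket = 0; kept holds the
    terms whose bracket vanishes.
    """
    k_terms, kept = {}, {}
    for (l, m), c in h_terms.items():
        bracket = 1j * sum(pj * (lj - mj) for pj, lj, mj in zip(p, l, m))
        if abs(bracket) < tol:
            kept[(l, m)] = c
        else:
            k_terms[(l, m)] = c / bracket
    return k_terms, kept

p = (1.0, 2.0 ** 0.5)            # sample values with no integral relation
cubic = {((2, 0), (1, 0)): 1 + 2j, ((0, 1), (1, 1)): -0.5j}
quartic = {((1, 1), (1, 1)): 3.0, ((2, 0), (0, 2)): 1.0}

k3, kept3 = normalization_step(cubic, p)
k4, kept4 = normalization_step(quartic, p)
```

For the sample data every cubic term is removable, while among the quartic terms only the one with $l = m$, a product of the $u_jv_j$, is kept.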

By continuing in this way, we are led to the conclusion:

For the general Hamiltonian equations (1) with first order stability, H
can be taken in the form $H(u_1v_1, \dots, u_sv_s)$. Inasmuch as the Hamiltonian
equations then take the form for $i = 1, \dots, s$,

$$\frac{du_i}{dt} = \frac{\partial H}{\partial \pi_i}\,u_i, \qquad \frac{dv_i}{dt} = -\frac{\partial H}{\partial \pi_i}\,v_i \qquad (\pi_i = u_iv_i)$$

of the completely prepared type (12), it is clear that first order stability
necessitates complete stability in the case of the Hamiltonian equations.

Of course it is well known that the Hamiltonian equations admit a complete
formal trigonometric solution in the case of first order stability.

We can now readily attack the question as to the possibility of reducing
a completely stable system (1) to Hamiltonian form by means of a
transformation belonging to the restricted formal group.

It was seen in the preceding section that the most general transformation
of the formal group which preserved the completely prepared form (12) was
given by (11). It was furthermore seen that $M_1, \dots, M_s$ are not altered,
i. e. that $\bar M_i$ is obtained directly from $M_i$ by substitution. But such
a substitution changes the variables in $M_1, \dots, M_s$ from $\pi_1, \dots, \pi_s$ to
$\bar\pi_1, \dots, \bar\pi_s$ in accordance with the formulas

$$\pi_i = \bar\pi_i F_i^* G_i^*.$$

Imagine now the particular transformation performed which reduces the
completely stable prepared equations (1) to Hamiltonian form, if that be
possible. Since $M_1, \dots, M_s$ are not altered, there must obtain

$$M_i = \partial H/\partial \bar\pi_i,$$

or, more briefly,

$$dH = \sum_{i=1}^{s} M_i\,d\bar\pi_i.$$

A necessary condition that such a transformation to Hamiltonian form
be possible is therefore that

$$\sum_{j=1}^{s} M_j(\pi_1, \dots, \pi_s)\,d\bar\pi_j$$

becomes an exact differential for suitable real power series

$$\pi_i = \bar\pi_i f_i(\bar\pi_1, \dots, \bar\pi_s) \qquad (i = 1, \dots, s)$$

where the constant terms $c_i$ in $f_i$ are not zero.

In particular this requires that the first degree terms in

$$\sum_{j=1}^{s} M_j\,d\bar\pi_j$$

yield an exact differential. More precisely if this term in $M_i$ is

$$a_{i1}\pi_1 + \cdots + a_{is}\pi_s$$

we must have

$$\frac{a_{ij}}{a_{ji}} = \frac{c_i}{c_j},$$

which is not in general true for s > 2.

The special case first pointed out when transformation to Hamiltonian
form is possible, namely when $M_i - p_i\sqrt{-1}$ is divisible by $\xi_i\eta_i$, falls
under the type treated with $d\bar\pi_i = dM_i/\sqrt{-1}$.

The necessary condition stated is also sufficient for transformation to
Hamiltonian form.

In fact choose $F_i^*, G_i^*$ in (11) so that $\pi_i = \bar\pi_i f_i$, which is obviously
possible in an infinite variety of ways. Then $M_1, \dots, M_s$ are unaltered, and
also

$$M_1\,d\bar\pi_1 + \cdots + M_s\,d\bar\pi_s = dH.$$

It follows that

$$M_i = \partial H/\partial \bar\pi_i \qquad (i = 1, \dots, s)$$

which is what we desire to prove.

It is apparent in this way that while completely stable systems are of 
equal generality with the Hamiltonian type in a very wide formal sense, they 
are more general if only transformations of restricted type are admitted. 

We turn now to the consideration of the general Pfaffian type of equation
derived from the non-specialized variation problem.

At the equilibrium point the equations of variation of a Pfaffian system
(15) are evidently of the form

$$\sum_{j} \left( \alpha_{ij} \frac{dx_j}{dt} + \beta_{ij} x_j \right) = 0$$

where $\alpha_{ij}$ is skew-symmetric (i. e. $\alpha_{ij} = -\alpha_{ji}$) and $\beta_{ij}$ is symmetric (i. e.
$\beta_{ij} = \beta_{ji}$). These are linear differential equations with constant coefficients,
the nature of whose solutions depends on the roots of the determinant equation
in $\lambda$

$$| \alpha_{ij} \lambda + \beta_{ij} | = 0.$$

In the case of stability the roots must be zero or pure imaginary, of course,
while the elementary divisors of this $\lambda$-matrix must be simple. We shall
assume that there are no linear homogeneous combinations of these roots with
integral coefficients which vanish. This is the general case.

In order to obtain a preliminary prepared form for the Pfaffian equations,
we note first that the constant terms in $P_i$ and $Q$ can be omitted since
they do not enter into the differential equations. Moreover, if the origin is
an equilibrium point, there are no linear terms in $Q$. According to the
principle of variation, we may directly transform the part of the integral $I$
containing the linear terms so as to obtain the linear terms with the new
variables. But it is known that by linear transformation and omission of an
exact differential, the linear terms in $P_i$ and $P_j$ become $-x_j$ and $x_i$, where
$x_i$, $x_j$ are conjugate variables in the sense of the Hamiltonian equations,
while simultaneously $Q$ takes the form

$$Q = -\sum_{i} \rho_i (x_i^2 + x_j^2) + \cdots,$$

where only the lowest degree terms are written. Now, if we note that $-x_j \, dx_i$
differs from $x_i \, dx_j$ by a perfect differential we can write the lowest degree terms
under the integral sign as

$$2 \sum_{i} x_i \, dx_j - \sum_{i} \rho_i (x_i^2 + x_j^2) \, dt$$

where the index $i$ runs over $s$ values while $j$ takes the conjugate index values.
But these are essentially the first order terms in the Hamiltonian integral.

At this stage we may introduce the conjugate variables $\xi_i$, $\eta_i$ by setting

$$\xi_i = x_i + \sqrt{-1} \, x_j, \qquad \eta_i = x_i - \sqrt{-1} \, x_j,$$

where we find the linear terms above take essentially the form

$$\sum_{i=1}^{s} \xi_i \, d\eta_i$$

while

$$Q = \sum_{i=1}^{s} \rho_i \sqrt{-1} \, \xi_i \eta_i + \cdots.$$

It is this preparation of the Pfaffian system in its lowest degree terms
that we shall assume to be made, and we shall seek to reduce it further to
complete Hamiltonian form by a succession of changes of variables

$$\bar{\xi}_i = \xi_i + F_i, \qquad \bar{\eta}_i = \eta_i + G_i \qquad (i = 1, \dots, s)$$

where $F_i$, $G_i$ are conjugate series in $\xi_1, \dots, \eta_s$ beginning with terms of at
least the second degree.

Let us write the prepared linear differential form as

$$R_1 \, d\xi_1 + \cdots + R_s \, d\xi_s + S_1 \, d\eta_1 + \cdots + S_s \, d\eta_s + Q \, dt,$$

and let us seek to eliminate the terms of second degree in $R_1, \dots, R_s$ by a
change of variables of the above type in which $F_i$, $G_i$ are arbitrary
homogeneous quadratic polynomials.

We find immediately that the above linear form is preserved in character
while we have

$$\bar{R}_i^{(2)} = R_i^{(2)} + \sum_{j} \xi_j \frac{\partial G_j}{\partial \xi_i} \qquad (i = 1, \dots, s)$$

if $\bar{R}_i^{(2)}$, $R_i^{(2)}$ denote the second degree terms in $\bar{R}_i$ and $R_i$ respectively.
Evidently, by the subtraction of an exact differential, we can replace the sums
on the right by the more abbreviated expressions $-G_i$.

Hence there exists a set of $G_i$ which removes the remaining terms in
$R_i^{(2)}$, and it is clear at the same time that in this way higher and higher
degree terms in $R_i$, $S_i$ can be disposed of by taking $F_i$, $G_i$ as homogeneous
polynomials of corresponding higher and higher degree. Thus a formal limiting
composite transformation is set up which reduces $R_1, \dots, R_s$ to zero, while
taking $S_1, \dots, S_s$ into $\xi_1, \dots, \xi_s$ respectively, as desired.

In the general case of first order stability the Pfaffian equations can be
transformed formally to Hamiltonian form.

§ 8. Questions of Convergence. 

The methods which we have been following are based on the use of certain 
series. Their convergence or divergence is immaterial for the properties which 
we have so far discussed. Nevertheless the questions of convergence are of 
great importance for the theoretic treatment of a given dynamical problem. 

For example, the Hamiltonian and Pfaffian types of equations (1) admit
of the convergent series integral $H$ = const. and $Q$ = const. respectively,
whence in the case of first order stability we can immediately infer the
permanent complete stability in the sense (A). Or again, there may exist a
convergent series $P$ for which

$$\frac{\partial}{\partial x_1} (P X_1) + \cdots + \frac{\partial}{\partial x_n} (P X_n) = 0$$

in which case $\int P \, dv$ is an invariant volume integral ($dv$, element of volume);
from the existence of this invariant volume integral, various important
properties of recurrence can be deduced.

In the most favorable circumstances the normal form of the equations
(18) can be obtained by ordinary transformations defined by convergent
series. Here the series $M_i$ and the trigonometric series solutions of the
original equations will converge. The precise character of the motion can then
be specified for all time. Concerning this integrable case we prove the
following simple result:

For integrability of the system (1) it suffices that the formal integrals

$$\pi_i = \xi_i \eta_i = \text{const.} \qquad (i = 1, \dots, s)$$

are given by convergent series.

In fact suppose that these real series do converge. Since they are formally
factorable into conjugate factors $\xi_i$, $\eta_i$, the separate components are given
by convergent series of course. This follows from Weierstrass's preparation
theorem. Hence

$$M_i = \frac{1}{\xi_i} \frac{d\xi_i}{dt} = \frac{1}{\xi_i} \sum_{j} X_j \frac{\partial \xi_i}{\partial x_j} \qquad (i = 1, \dots, s)$$

are series given as a sum of formal quotients of two convergent series and so
themselves are convergent. This demonstrates the complete integrability of
(1) in case the series converge.

The existence of 'small divisors' in the series $\xi_i$, $\eta_i$ shows that in general
there will be no such convergent series $\pi_i$, and the series $M_i$ appear to be
divergent for the same reason.

The possibilities of classification of completely stable systems (1) on 
the basis of diverse convergence properties are limitless. Every system of 
series or of differential equations which may be attached to (1) so as to have 
invariantive significance under the general formal group evidently affords a 
means of classification of systems (1) in which convergence or divergence 
plays a role. 

It is with these few remarks that we terminate our reference to convergence
questions. While on the purely formal side any completely stable system
of equations (1) is to be regarded as equivalent to the extended Pfaffian or 
Hamiltonian type derived from a variational principle, it is clear that these 
more special types possess at least one ordinary integral and other convergence 
properties. In this non-formal direction the complete characterization of the 
completely stable equations (1) which may be reduced to the classical types 
still remains to be made. 





Reprinted from Acta Mathematica, October, 1927, Vol. 50, pp. 359-379.


ON THE PERIODIC MOTIONS OF DYNAMICAL SYSTEMS. 

BY

GEORGE D. BIRKHOFF

of CAMBRIDGE, U. S. A.

1. Introduction. 

In his work on dynamics Poincaré was led to focus attention primarily
upon the periodic motions. He conjectured that any motion of a dynamical
system might be approximated by means of those of periodic type, i.e. that the
periodic motions would be found to be densely distributed among all possible
motions; and it became a task of the first order of importance for him to determine
what the actual distribution of the periodic motions was, so as to prove or
disprove his conjecture.

Poincaré employed the method of analytic continuation in his great Prize
Memoir in the Acta Mathematica, which dealt with the problem of $n$ bodies. In
the integrable limiting case when the masses of all but one of the bodies vanish,
there are infinitely many periodic motions. By varying certain parameters he
passed from this trivial limiting case to the case when none of the masses are zero, and
showed that these periodic motions persist as members of analytic families, unless
two of them combine and disappear from the real domain like the roots of
algebraic equations with real coefficients. He did not consider the possibility
of disappearance of such a motion by its period becoming infinite, although this
possibility requires consideration also.

Unfortunately this method of analytic continuation gave very meagre results,
for the following reason. Although there are infinitely many periodic
families, it is conceivable that the range of the parameters becomes less and
less as the type of the periodic motion becomes more and more complicated.




This would mean that only a finite number of the periodic motions might exist 
for any particular set of the values of the parameters other than that of the 
trivial integrable case. 

Thus Poincaré found the method of analytic continuation to be insufficient,
and was forced to seek other instruments of attack. To begin with, he fastened
attention mainly upon the simplest possible case with two degrees of freedom,
namely the so-called restricted problem of three bodies. Almost all of the
qualitative reasoning in his Méthodes nouvelles de la Mécanique céleste deals only
with this case.

Notwithstanding this severe limitation, and despite many years of effort,
Poincaré was not able fully to attain his goal. Near the end of his life he
gave out his last geometric theorem without complete proof.¹ By its means he
showed that in the restricted problem of three bodies and analogous problems,
an infinite number of periodic motions would exist. In a recent paper² I have
generalized this theorem and my earlier proof of it, although without giving
the dynamical application. At the kind invitation of Professor Mittag-Leffler, I
endeavour to set forth here, with as little technicality as possible, the essential
facts known to me concerning the distribution of the periodic motions, particularly
as based on an application of the geometric theorem of Poincaré and its
generalization.


2. The billiard ball on a convex table. 

In order to see how the theorem of Poincaré and its generalization can be
applied to dynamical systems with two degrees of freedom, I propose to draw
attention to a special but highly typical system of this sort, namely that afforded
by the motion of a billiard ball upon a convex billiard table (Fig. 1). This example
is very illuminating for the following reason: Any dynamical system with two
degrees of freedom is isomorphic with the motion of a particle on a smooth
surface rotating uniformly about a fixed axis and carrying a conservative field
of force with it.³ In particular if the surface is not rotating and if the field
of force is lacking, the paths of the particles will be geodesics. If the surface

¹ Sur un théorème de Géométrie, Rendiconti del Circolo Matematico di Palermo, vol. 33, 1912.

² An Extension of Poincaré's Last Geometric Theorem, Acta Mathematica, vol. 47, 1926.

³ See my paper, 'Dynamical Systems With Two Degrees of Freedom', Transactions of the
American Mathematical Society, vol. 18, 1917. It is assumed that the Lagrangian principal
function $L$ is quadratic in the velocities.




is conceived of as convex to begin with and then gradually to be flattened to the
form of a plane convex curve C, the 'billiard ball' problem results. But in this
problem the formal side, usually so formidable in dynamics, almost completely
disappears, and only the interesting qualitative questions need to be considered.
If C happens to be an ellipse an integrable system results, namely as a limiting
case of the geodesics on an ellipsoid treated by Jacobi.

In this problem one can arrive at the existence of certain periodic motions
by direct maximum-minimum methods. As of interest in itself I wish to show
how this can be done. Results which are being obtained by Morse (but not yet
published) indicate that the scope of these methods, already developed to some
extent by Hadamard, Poincaré, Whittaker and myself, can be further extended.



Fig. 1. 


Thus, the power of such maximum-minimum considerations in the billiard ball 
problem is likely to prove typical of the general case. 

Any longest chord of the curve C (or boundary of the billiard table) when
traversed in both directions evidently yields one of the simplest periodic motions.
The billiard ball moving along this chord strikes the curved boundary at right
angles and recoils along it in the opposite direction. If we seek to vary this
chord continuously, while diminishing its length as little as possible, so as finally
to interchange its two ends, there will be an intermediate position of least length
which will be the chord of C where C is of least breadth. Detailed computation
of the slightly perturbed motions indicates that the first of these two periodic
motions is unstable, while the second is stable, i.e. with formal trigonometric
series for the perturbations.

Next we ask for the triangle of maximum length inscribed in C. Evidently
at least one such triangle will exist, and can have no degenerate side of zero
length. At each of its vertices the tangent will, of course, make equal angles
with the two sides passing through the vertex. Hence a harmonic triangle is
obtained which will correspond to two distinct motions, one for each of the two
possible senses of description.

Moreover if we seek to vary this triangle continuously, without changing
the order of its vertices and diminishing the perimeter as little as possible, so
as finally to advance the vertices cyclically, we discover a second harmonic
triangle, also corresponding to two periodic motions.

In this way the existence of two harmonic $n$ sided polygons which make
$k$ circuits of the curve C ($k$ less than $n/2$ and prime to $n$) can be proved. The two
motions corresponding to the polygon of maximum type will be unstable, while
the other of minimax type may be stable or unstable.

In the case of a circular boundary the totality of regular inscribed polygons
(simple or cross) form the harmonic polygons.
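
The admissible types of harmonic polygon can be listed directly from the condition that $k$ be less than $n/2$ and prime to $n$. The short sketch below is only an illustration of that counting rule; the function name `harmonic_polygon_types` is hypothetical and nothing in it depends on the shape of the table.

```python
from math import gcd


def harmonic_polygon_types(n):
    """Circuit numbers k with 0 < k < n/2 and gcd(k, n) == 1."""
    return [k for k in range(1, n) if 2 * k < n and gcd(k, n) == 1]


# Triangles admit only k = 1; pentagons admit the simple pentagon (k = 1)
# and the crossed pentagon or pentagram (k = 2).
types_by_sides = {n: harmonic_polygon_types(n) for n in range(3, 9)}
```

Since each admissible pair $(n, k)$ carries two harmonic polygons and each polygon two senses of description, every entry of the list stands for four periodic motions.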

We propose next to set up a ring transformation associated with the billiard
ball problem, and to show how the geometric theorem of Poincaré in its first
form leads to the facts deduced above. The reduction to a ring transformation
is of fundamental theoretic importance, quite aside from the relation to the
question of periodic motions. It should be noted also that in the cases of most
interest like the restricted problem of three bodies¹ the method of reduction to
a ring transformation and application of the theorem of Poincaré is available
for the treatment of the periodic motions, while the method of maximum-minimum
has not as yet been shown to be applicable.

3. Reduction to a ring transformation T. 

To begin with we suppose the length of C to be $2\pi$ and to be measured
from a fixed point $O$ to a variable point $P$ by an angular coordinate $\varphi$ (Fig. 2).

At $P$, taken as the point of projection of the billiard ball, let $\theta$ denote the
angle between the positive direction of the tangent and the direction of projection.
The variable $\theta$ varies between $0$ and $\pi$ only. These coordinates $\theta$, $\varphi$ suffice to
represent all possible states of projection unambiguously. If $\varphi$ be taken as an
angular coordinate in the plane, while $\theta$, augmented by a constant, say $\pi$, be
taken as a radial coordinate, the set of values $\theta$, $\varphi$ are represented on a ring
bounded by concentric circles of radius $\pi$ and $2\pi$ respectively, namely the circles
$\theta = 0$ and $\theta = \pi$ (Fig. 3).

¹ See my paper The Restricted Problem of Three Bodies, Rendiconti del Circolo Matematico
di Palermo, vol. 39, 1915. See also the paper of Poincaré referred to above.

Consider now a definite state of projection at $P$ with given $\theta$, $\varphi$. The
billiard ball leaves the edge at $P$ to strike it again at $P_1$, there to be projected
in a state $\theta_1$, $\varphi_1$, say, and so forth indefinitely. If C is an analytic curve, as we
assume it to be, the correspondence between $\theta$, $\varphi$ and $\theta_1$, $\varphi_1$ is evidently
one-to-one and analytic within the ring. When $\theta$ is nearly $0$ or $\pi$, the ball is projected
at a slight angle to the edge, and strikes it again at a nearby point with $\theta_1$
nearly $0$ or $\pi$ as the case may be. Hence the points on the bounding circles
correspond to themselves with $\theta_1 = \theta$, $\varphi_1 = \varphi$.

One further remark needs to be made about the correspondence along the
two boundaries of the ring. If we think of each point $(\theta, \varphi)$ as being carried
into $(\theta_1, \varphi_1)$ by a transformation or deformation of the ring, this transformation
$T$ will effect a certain number of complete rotations of the inner circle, and
also of the outer circle, since the points of these boundaries are invariant as
just seen. We may arbitrarily regard the inner circle as having undergone no
rotation, but the same will not then be true of the outer circle which can at
once be shown to have undergone a single complete revolution in the positive
sense. For let the projection angle $\theta$ for a given point $P$ with corresponding
fixed $\varphi$ vary from $0$ to $\pi$. It is obvious that then $\theta_1$ will increase from $0$ to $\pi$
while $\varphi_1$ increases by $2\pi$ since the point $P_1$ makes a complete circuit of C in a
positive sense. In other words, the transformation $T$ takes radial segments
across the ring into curves starting at the same point of the inner circle but
winding around the ring just once while crossing it. Hence the outer boundary
has undergone a single positive revolution under the transformation $T$.

Fig. 2.

Suppose now that we have a periodic motion, for example that corresponding
to one of the harmonic triangles taken in a positive sense. It is evident
that the transformation $T$ of the ring takes the point of the ring representing
the state of projection at the first vertex into that of the second; and likewise
takes the state for the second vertex into that for the third, and that for the
third vertex into the first. Thus when $T$ is applied, the triple of points on the
ring is cyclically advanced, and each point of the triple is unaltered by the
application of the third iterate $T^3$ of $T$.

Conversely to any triple with this property, or to any point invariant under
$T^3$ together with its images under $T$ and $T^2$, corresponds a motion belonging
to a harmonic triangle. Evidently then from considerations advanced earlier
there are at least four such triples.

It is obvious that there can be no invariant points under $T$ itself, because
$\varphi$ is increased but by less than $2\pi$.



In this way the search for harmonic polygons and the allied periodic motions
in the billiard ball problem resolves itself into the determination of sets
of distinct points $P_1, \dots, P_n$ cyclically advanced by $T$, so that in general we have

$$T(P_i) = P_{i+1} \qquad (i = 1, \dots, n; \ P_{n+1} = P_1).$$

It could be shown more generally that each and every interesting property
of the motion of the billiard ball is mirrored in a corresponding property of the
transformation $T$. Thus the dynamical problem is effectively reduced to that of
a particular transformation of a circular ring into itself.
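
The ring transformation can be written down explicitly in the one case where no computation is needed, namely a circular table. The sketch below is an illustration under that assumption only (unit radius, so that the perimeter is $2\pi$); it is not the general transformation of the text, and the function names are hypothetical. A chord leaving at angle $\theta$ to the tangent subtends the arc $2\theta$, so $\theta$ is unchanged and $\varphi$ advances by $2\theta$.

```python
import math


def circle_billiard_map(theta, phi):
    """One bounce on a circular table of perimeter 2*pi (unit radius).

    phi is deliberately not reduced mod 2*pi, so that complete circuits
    of the boundary can be counted.
    """
    return theta, phi + 2.0 * theta


def iterate(theta, phi, n):
    """Apply the map n times."""
    for _ in range(n):
        theta, phi = circle_billiard_map(theta, phi)
    return theta, phi


# Boundary behaviour: the inner circle (theta = 0) is fixed pointwise,
# the outer circle (theta = pi) is advanced by one complete revolution.
inner = circle_billiard_map(0.0, 1.0)
outer = circle_billiard_map(math.pi, 1.0)

# Harmonic triangle: theta = pi/3 closes after 3 bounces and 1 circuit.
theta3, phi3 = iterate(math.pi / 3.0, 0.5, 3)
circuits = (phi3 - 0.5) / (2.0 * math.pi)
```

In this special case every point with $\theta = k\pi/n$ is invariant under the $n$th iterate, which reflects the integrable character of the circular table; for a general curve C only isolated sets of such points are to be expected.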


4. The invariant integral.

There is a further property of the transformation

$$T: \quad \theta_1 = f(\theta, \varphi), \quad \varphi_1 = g(\theta, \varphi)$$

which plays a fundamental part in applying the geometric theorem of Poincaré:
the double integral $\iint \sin\theta \, d\theta \, d\varphi$ taken over any area $\sigma$ of the ring has the
same value as over the images under $T$ and its iterates.

Before passing to the entirely elementary proof of this fact, one immediate
conclusion may be cited in justification of the statement as to the fundamental
theoretic importance of the ring transformation which was made earlier. Since
the integrals evaluated over $\sigma, \sigma_1, \sigma_2, \dots$ have the same value, and since its value
over the entire ring is finite, being $4\pi$, some two of the images $\sigma_i$ and $\sigma_j$
overlap. Employing the inverse transformation we infer that $\sigma_{i-1}$ and $\sigma_{j-1}$ also
overlap, and thus finally that $\sigma_{i-j}$ and $\sigma$ overlap ($i > j$). But, interpreted for the
billiard ball problem, this means that the ball can be projected very nearly, with
arbitrary position and direction, to return subsequently to nearly the same
position and direction. As elaborated by Poincaré, this chain of reasoning leads to
the conclusion that the 'probability' is unity for an arbitrary motion to return
infinitely often to the neighborhood of its initial state. He called this property
of the dynamical system 'stability in the sense of Poisson'.

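The proof that the double integral is invariant depends on an explicit
evaluation of the Jacobian

$$J = \frac{\partial \theta_1}{\partial \theta} \frac{\partial \varphi_1}{\partial \varphi} - \frac{\partial \theta_1}{\partial \varphi} \frac{\partial \varphi_1}{\partial \theta}.$$

In fact if $\iint M(\theta, \varphi) \, d\theta \, d\varphi$ is invariant, we have

$$\iint_{\sigma_1} M(\theta_1, \varphi_1) \, d\theta_1 \, d\varphi_1 = \iint_{\sigma} M(\theta, \varphi) \, d\theta \, d\varphi$$

where the variables $\theta_1$, $\varphi_1$ range over the region $\sigma_1$ just as $\theta$, $\varphi$ do over $\sigma$. But
according to the fundamental theorem for change of variables, $T$ gives the
integral on the left the form

$$\iint_{\sigma} M(\theta_1, \varphi_1) \, J \, d\theta \, d\varphi.$$

Comparing this expression and the integral on the right which are both integrals
over the same arbitrary region $\sigma$ we deduce the functional relation:

$$J = \frac{M(\theta, \varphi)}{M(\theta_1, \varphi_1)}$$

as the well known necessary and also sufficient condition for invariance. Hence
to establish that $\iint \sin\theta \, d\theta \, d\varphi$ is invariant we must only prove

$$J = \frac{\sin\theta}{\sin\theta_1}.$$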

Let

$$x = F(\varphi), \qquad y = G(\varphi)$$

be the equations of C in rectangular coordinates, so that, if $\tau$ denotes the angle
between the positive tangential direction at a point of C and the positive $x$ axis,
we have

$$\tau = \tan^{-1} \frac{G'(\varphi)}{F'(\varphi)}.$$

Similarly let $\tau_1$ denote the like angle at the transformed point, which will be given
by the same expression save that $\varphi$ is replaced by $\varphi_1$. Finally let $\alpha$ designate
the angle between the positively directed $x$ axis and the direction of initial
projection (Fig. 4). It is evident that the following two relations will hold

$$\theta = \alpha - \tau, \qquad \theta_1 = \tau_1 - \alpha.$$

Fig. 4.

Substituting in the above value for $\tau$ and the analogous value for $\tau_1$, and also
substituting in for $\alpha$ the value

$$\tan^{-1} \frac{G(\varphi_1) - G(\varphi)}{F(\varphi_1) - F(\varphi)}$$

evident by inspection, we obtain the explicit formulas

$$T: \quad \begin{cases} \theta = \tan^{-1} \dfrac{G(\varphi_1) - G(\varphi)}{F(\varphi_1) - F(\varphi)} - \tan^{-1} \dfrac{G'(\varphi)}{F'(\varphi)} = L(\varphi, \varphi_1), \\[2ex] \theta_1 = \tan^{-1} \dfrac{G'(\varphi_1)}{F'(\varphi_1)} - \tan^{-1} \dfrac{G(\varphi_1) - G(\varphi)}{F(\varphi_1) - F(\varphi)} = M(\varphi, \varphi_1). \end{cases}$$

These two equations define the transformation $T$ from $(\theta, \varphi)$ to $(\theta_1, \varphi_1)$.
Taking differentials we find

$$d\theta = L_{\varphi} \, d\varphi + L_{\varphi_1} \, d\varphi_1, \qquad d\theta_1 = M_{\varphi} \, d\varphi + M_{\varphi_1} \, d\varphi_1,$$

whence at once

$$d\varphi_1 = \frac{d\theta - L_{\varphi} \, d\varphi}{L_{\varphi_1}}.$$

This gives us the Jacobian

$$J = -\frac{M_{\varphi}}{L_{\varphi_1}} = -\frac{[F(\varphi_1) - F(\varphi)] \, G'(\varphi) - [G(\varphi_1) - G(\varphi)] \, F'(\varphi)}{[F(\varphi_1) - F(\varphi)] \, G'(\varphi_1) - [G(\varphi_1) - G(\varphi)] \, F'(\varphi_1)}.$$

But $F(\varphi_1) - F(\varphi)$, $G(\varphi_1) - G(\varphi)$ are proportional to $\cos\alpha$, $\sin\alpha$ respectively,
while we have also

$$F'(\varphi) = \cos\tau, \quad G'(\varphi) = \sin\tau, \quad F'(\varphi_1) = \cos\tau_1, \quad G'(\varphi_1) = \sin\tau_1,$$

so that finally we obtain

$$J = \frac{\sin(\alpha - \tau)}{\sin(\tau_1 - \alpha)} = \frac{\sin\theta}{\sin\theta_1}$$

as was stated.
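
The relation $J = \sin\theta / \sin\theta_1$ can be checked numerically on a table for which the second point of impact is available in closed form. The following is a hedged sketch, not the author's computation: it assumes an elliptical table with semi-axes `a` and `b`, parametrised by the eccentric angle `t` rather than by arc length, so the predicted Jacobian in the coordinates $(\theta, t)$ carries the extra factor $s'(t) / s'(t_1)$, where $s$ is arc length. All function names and the sample state are hypothetical.

```python
import math


def ellipse_bounce(theta, t, a=2.0, b=1.0):
    """Map (theta, t) -> (theta1, t1) on the ellipse (a cos t, b sin t)."""
    x, y = a * math.cos(t), b * math.sin(t)
    tau = math.atan2(b * math.cos(t), -a * math.sin(t))  # tangent angle
    alpha = tau + theta                                   # theta = alpha - tau
    dx, dy = math.cos(alpha), math.sin(alpha)
    # Second intersection of the ray with x^2/a^2 + y^2/b^2 = 1.
    s = -2.0 * (x * dx / a**2 + y * dy / b**2) / (dx**2 / a**2 + dy**2 / b**2)
    x1, y1 = x + s * dx, y + s * dy
    t1 = math.atan2(y1 / b, x1 / a)
    tau1 = math.atan2(b * math.cos(t1), -a * math.sin(t1))
    theta1 = (tau1 - alpha) % (2.0 * math.pi)             # theta1 = tau1 - alpha
    return theta1, t1


def speed(t, a=2.0, b=1.0):
    """Length of d(x, y)/dt, the factor converting dt into arc length."""
    return math.hypot(a * math.sin(t), b * math.cos(t))


def wrap(d):
    """Reduce an angle difference to the interval [-pi, pi)."""
    return (d + math.pi) % (2.0 * math.pi) - math.pi


def numerical_jacobian(theta, t, h=1e-5):
    """Central-difference determinant of d(theta1, t1) / d(theta, t)."""
    th_p, t_p = ellipse_bounce(theta + h, t)
    th_m, t_m = ellipse_bounce(theta - h, t)
    th_q, t_q = ellipse_bounce(theta, t + h)
    th_r, t_r = ellipse_bounce(theta, t - h)
    d11, d21 = wrap(th_p - th_m) / (2 * h), wrap(t_p - t_m) / (2 * h)
    d12, d22 = wrap(th_q - th_r) / (2 * h), wrap(t_q - t_r) / (2 * h)
    return d11 * d22 - d12 * d21


theta, t = 1.1, 0.4
theta1, t1 = ellipse_bounce(theta, t)
predicted = math.sin(theta) * speed(t) / (math.sin(theta1) * speed(t1))
measured = numerical_jacobian(theta, t)
```

The two numbers `predicted` and `measured` should agree to within the accuracy of the finite differences; for a circle the speed factors cancel and the relation reduces to $J = 1$.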


5. Application of the theorem of Poincaré.

As has been seen, there are no points of the ring which are invariant under
$T$. On the other hand consider $T^2$ followed by a rotation of the $\theta$, $\varphi$ plane
through an angle $-2\pi$ which we designate by $R_{-1}$. The resultant transformation
of the ring admits the same area integral as $T$, of course, but advances the points
of the outer circle by an angle $2\pi$, and those of the inner circle by an angle
$-2\pi$ of opposite sign. These are the two conditions essential for the application
of the theorem which states that any one-to-one transformation of a ring,
admitting an invariant integral and rotating the boundaries in opposite angular
directions, possesses at least two distinct invariant points with indices of opposite
signs. Hence $R_{-1} T^2$ (the compound transformation) possesses two such invariant
points. This means that $T^2$ has two geometrically distinct invariant points of
oppositely signed indices¹, although these correspond to an increase of $2\pi$ for $\varphi$.

If $P$ is such an invariant point, so is $T(P)$ of course but with the same
index. Thus we get two point pairs, say

$$P, \ T(P); \qquad Q, \ T(Q)$$

all four distinct. These evidently correspond to the two fundamental periodic
motions.

For the application of the theorem of Poincaré to the periodic motions of
more complicated type it is necessary to take account of the fact that every such
motion is associated with a distinct second such motion obtained by reversing the
direction of motion, although these motions have the same index. However, one
of these motions increases $\varphi$ by $2k\pi$ while the other increases it by $2(n - k)\pi$. By
only considering invariant points of $T^n$ ($n > 2$) for which $\varphi$ increases by $2k\pi$, $k \le n/2$,
we clearly obtain each harmonic $n$ sided polygon only once. It may be noted
in passing that this pairing of motions in the billiard ball problem is fully
reflected in the fact that $T$ is a product of two involutory transformations: it was
the same special property of the ring transformation in the restricted problem of
three bodies which enabled me to prove the existence of infinitely many
symmetric periodic orbits.²

Now turn to the invariant points of the compound transformation $R_{-k} T^n$
where $R_{-k}$ denotes a $k$ fold rotation through the angle $-2\pi$. The rotations on the
outer and inner circles are clearly

$$2(n - k)\pi \quad \text{and} \quad -2k\pi,$$

which will be of opposite sign if $0 < k < n/2$. Thence we can infer the existence
of at least two geometrically distinct series of points

¹ See my recent Acta article (loc. cit.). By the index of an invariant point is meant the total
change in angular direction of a line joining a point $P$ to its image $P_1$ when $P$ makes a small
positive circuit of the invariant point.

² See my paper in the Rendiconti di Palermo, loc. cit.


$$P, \ T(P), \ \dots, \ T^{n-1}(P)$$

such that we have $T^n(P) = P$, $T^n(Q) = Q$ while $\varphi$ has been increased by $2k\pi$. It is
assumed that $k$ and $n$ are relatively prime.

To prove this assertion in detail, we may let $P$ be one such invariant point,
such that $T^n$ increases $\varphi$ by $2k\pi$. If

$$P, \ T(P), \ \dots, \ T^{n-1}(P)$$

are not distinct, let $T^m(P) = P$ ($0 < m \le n - 1$) and suppose $\varphi$ to be increased by
$2l\pi$. By combination of the two symbolic equations $T^m(P) = P$, $T^n(P) = P$ we obtain
$T^d(P) = P$ where $d$ ($\ne 1$) is the greatest common divisor of $m$ and $n$. Thus $P$ is
invariant under $T^d$. Suppose that under $T^d$ the $\varphi$ of $P$ increases by $2k_1\pi$. From the
equation $T^n = (T^d)^{n/d}$ we see that $T^n$ will then increase the $\varphi$ of $P$ by $2k_1\pi \, n/d$ so that
$k = k_1 n / d$. Thus $k$ and $n$ would possess a common factor, contrary to hypothesis.
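
The role played by the hypothesis that $k$ and $n$ be relatively prime can be seen in exact arithmetic on a circular table, where each impact advances $\varphi$ by $2k\pi/n$. The sketch below is an illustration under that assumption only; the function names are hypothetical, and positions are recorded as fractions of the perimeter.

```python
from fractions import Fraction
from math import gcd


def orbit_positions(n, k):
    """Positions phi / (2*pi) mod 1 of P, T(P), ..., T^(n-1)(P) on a
    circular table when each impact advances phi by 2*pi*k/n."""
    return [Fraction(j * k, n) % 1 for j in range(n)]


def count_distinct(n, k):
    """Number of geometrically distinct points in the series."""
    return len(set(orbit_positions(n, k)))


# k prime to n: the n points are distinct (here a pentagram).
distinct_5_2 = count_distinct(5, 2)
# k and n with common factor 2: the series closes after n/2 points.
distinct_6_2 = count_distinct(6, 2)
```

In each case the number of distinct points is $n$ divided by the greatest common divisor of $k$ and $n$, so that a common factor makes the series close before $n$ steps, as in the argument above.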

Also not only are the first series of $n$ points distinct but these have the
same index. Hence there will be a point $Q$ invariant under $T^n$ and with oppositely
signed index. This with its images under successive powers of $T$ will necessarily
be distinct from the closed set generated by $P$, and leads to a second distinct
set generated by $Q$.

Hence we obtain for every $n$ and every relatively prime $k < n/2$ two
geometrically distinct harmonic polygons with $n$ sides and making $k$ circuits of the
curve C. Corresponding to these there will be, of course, four periodic motions.
We shall not attempt to develop here the characteristics as to type of stability
and instability dependent upon the sign of the index.

We may point out the general significance of such ring transformations
for dynamical systems with two degrees of freedom. For such a system
there are two space coordinates and two velocity coordinates.
These four quantities determine a state of motion, or a 'point' in the four
dimensional manifold of states of motion. Each motion is represented as a
curve of motion in this manifold. But the existence of an integral shows that
these curves lie in three-dimensional sub-manifolds, upon one of which
we fix attention. Suppose that we can find a two-dimensional ring in the
sub-manifold, bounded by two closed curves of motion and cut by
the other curves of motion infinitely often and in the same sense. A point
$P$ of the ring, followed along its curve of motion until it meets
the ring again at $P_1$, defines a transformation $T$ of the ring into itself, namely
the transformation which takes $P$ into its image $P_1$. This transformation $T$ and
its powers will in general possess the properties necessary for the application
of Poincaré's last geometric theorem, and thus there must exist infinitely many
further periodic motions. Unfortunately such a ring is not known to exist in
general, although it does in some interesting cases. Furthermore it will be
readily believed that the analytic labor of actually setting up the ring, and the
transformation, is large. In the billiard ball problem, the integral used is that of
constant energy, while the auxiliary periodic motions evidently are those obtained
by a rolling motion of the billiard ball around the ring in either sense.

6. Application of the generalized theorem. 

For the application of the geometric theorem of Poincaré used above it is
essential to have two stable periodic motions corresponding to the boundaries of
the ring, and then to obtain the ring itself in case it exists.

The generalized theorem differs essentially from the form so far employed
here in that only one boundary of the ring, say the inner, is required to be
invariant under the transformation. However, it is required that the image of the
outer circle under $T$ be cut only once by any radius vector, and that its points
be angularly advanced in the opposite sense from that of the inner circle. To
apply the theorem it is only necessary to know a single stable periodic motion.
The conclusion to be drawn from it is that there exist two and hence infinitely
many other periodic motions in the immediate neighborhood of the given stable
periodic motion. Thus every stable periodic motion is a cluster motion for infinitely
many other periodic motions near to it, but in general making many circuits about
it before reentering.

We shall make use of the billiard ball problem again in applying the
generalized theorem, although obviously the method is entirely general. It does
not seem evident that the same results can be obtained by the maximum-minimum
method. In fact the success in application depends on details which do
not seem to come into play in the use of maximum-minimum considerations.

A typical stable periodic orbit with which to start is the simplest one of
stable type which traces out twice the chord crossing C where it is of least breadth.
To it there corresponds an invariant point under $T^2$ of stable type. Considerations
based only upon the existence of an invariant area integral show that for
suitably taken coordinates, the transformation $T$ may be given the form:

$$\begin{cases} u_1 = u \cos(\psi_0 - c r^2) - v \sin(\psi_0 - c r^2) + P_n(u, v) \\ v_1 = u \sin(\psi_0 - c r^2) + v \cos(\psi_0 - c r^2) + Q_n(u, v) \end{cases}$$

near the invariant point taken at $(0, 0)$.¹ Here $r^2$ stands for $u^2 + v^2$ while $P_n$, $Q_n$
are convergent power series in $u$, $v$ beginning with terms of the $n$th degree in
$u$, $v$, with $n$ arbitrarily large. The constant $\psi_0$ is supposed to be incommensurable
with $2\pi$, and $c$ to be different from $0$. Both of these conditions are in general
satisfied. When they fail to be satisfied although the motion be of stable type,
the generalised theorem remains applicable, but this fact cannot be touched
upon here.

From the form of this transformation it is apparent that the circles $r^2 = \lambda$
are carried into curves

$$r^2 + R_n(u, v) = \lambda$$

where $R_n$ is of the n-th order in u, v. Such a curve will differ only very slightly
from the circle from which it came. In general it is apparent that in polar
coordinates the transformation T is very closely like the following:


$$r_1 = r, \qquad \varphi_1 = \varphi + \psi_0 - c r^2$$

near the origin. But the n-th power of this transformation rotates the radial
directions at the invariant point by $n\psi_0$ and on the circle $r = \rho_0$ by
$n(\psi_0 - c\rho_0^2)$.

Hence if n is so large that for some integer k

$$n(\psi_0 - c\rho_0^2) < 2k\pi < n\psi_0$$

the original theorem of Poincaré would be applicable, with the ring bounded by
the circles $r = 0$ and $r = \rho_0$, and with the transformation $R_{-2k\pi}T^n$, where
$R_{-2k\pi}$ stands for a rotation through an angle $-2k\pi$ and T for the transformation
written explicitly above. Of course this ring may be expanded radially by a constant
distance so as to yield the usual form of ring transformation.

It can be established that for suitably chosen $\rho_0$ and n the approximation
to the actual transformation $T^n$ is so close that the outer boundary,
although not the original circle, is a curve met once and only once by any
radius vector, while the inner and outer boundaries are still advanced in
opposite senses. The details cannot of course be given here.
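
The choice of n and k can be explored numerically. The following Python sketch is only an illustration under stated assumptions: it uses the model transformation obtained by dropping the remainder terms $P_n$, $Q_n$, it assumes c > 0, and the numerical values of $\psi_0$, c and $\rho_0$ are arbitrary. The helper names `twist_map` and `find_n_k` are hypothetical and do not come from the paper.

```python
import math

def twist_map(u, v, psi0, c):
    """Model map: rotate (u, v) through the angle psi0 - c*r^2 (remainder terms dropped)."""
    r2 = u * u + v * v
    ang = psi0 - c * r2
    return (u * math.cos(ang) - v * math.sin(ang),
            u * math.sin(ang) + v * math.cos(ang))

def find_n_k(psi0, c, rho0, n_max=100000):
    """Smallest n (with a matching k) such that
    n*(psi0 - c*rho0^2) < 2*k*pi < n*psi0, assuming c > 0."""
    two_pi = 2.0 * math.pi
    for n in range(1, n_max + 1):
        lower = n * (psi0 - c * rho0 ** 2)
        upper = n * psi0
        k = math.floor(lower / two_pi) + 1  # smallest integer k with 2*k*pi > lower
        if two_pi * k < upper:
            return n, k
    return None

psi0, c, rho0 = 1.0, 0.5, 0.1  # illustrative values only
n, k = find_n_k(psi0, c, rho0)
print(n, k)
```

For the pair (n, k) found in this way, the inner and outer circles of the model ring are advanced in opposite senses once the rotation through $2k\pi$ is discounted, which is the situation the ring theorem requires.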

Thus the generalized theorem shows that there exist two invariant points
of $T^n$ within the circle $r = \rho_0$, which advance $\varphi$ by $2k\pi$ in an angular sense.
Evidently the periodic motions so obtained are uniformly near to the stable
motion with which we began. In this way we can infer the existence of infinitely
many periodic motions in the immediate neighborhood of the simplest periodic
motion of stable type, so that there are infinitely many harmonic polygons lying
in the immediate neighborhood of the chord crossing the curve C at its narrowest
part. A more careful examination of the asymptotic form of the transformation
T near the invariant point shows that such further invariant points obtained with
positive index are stable, while those of negative index are unstable. Hence it
may be said in addition that infinitely many nearby motions are of stable type and
infinitely many others are of unstable type.

We may now start afresh with these new stable motions or with a stable 
motion corresponding to a known harmonic polygon with n> 2 , and discover 
further periodic motions by another application of the generalized theorem. In 
the next section we shall show how such a repetition leads to nearly periodic 
motions in the sense of Bohr, such as have not been proved hitherto to exist in 
dynamical problems. 

Before doing so, however, it is of especial interest in the billiard ball problem
to discuss the limiting periodic motions corresponding to the rolling of the ball
around the table in the two possible senses. We propose to outline how an
application of the generalized theorem leads to the conclusion that there exist
infinitely many periodic motions uniformly near to these rolling motions, so that
the corresponding harmonic polygons of n sides lie in the immediate neighborhood
of C. For this purpose it is essential to examine the explicit formulas given for
T in the case when $\theta$ is small. A direct computation leads to the result

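$$\theta_1 = \theta + \frac{2\kappa'(\varphi)}{3\kappa^2(\varphi)}\,\theta^2 + \cdots, \qquad \varphi_1 = \varphi + \frac{2}{\kappa(\varphi)}\,\theta + \cdots,$$

where the function $\kappa(\varphi)$ denotes the curvature of C at the point with given $\varphi$ and
where the omitted coefficient functions depend on $\varphi$ only. Proceeding entirely formally and
replacing $\theta_1 - \theta$ and $\varphi_1 - \varphi$ by $d\theta$ and $d\varphi$ respectively, we obtain the approximate
differential equation:

$$\frac{d\theta}{d\varphi} = \frac{1}{3}\,\frac{\kappa'(\varphi)}{\kappa(\varphi)}\,\theta$$

which gives by integration

$$\theta = \theta_0\,\kappa^{1/3}(\varphi).$$

Here $\theta_0$ is the value of $\theta$ for a point of curvature unity. This result indicates that,
to a first approximation, the curve $\theta = \theta_0\,\kappa^{1/3}(\varphi)$ near the inner boundary $\theta = 0$ of
the ring is nearly invariant under T, and can undoubtedly be modified slightly
in higher order terms so as to be still more nearly invariant. Evidently the
limiting motions formed by C must be regarded as analogous to stable periodic
motions on this account.

Also if the variable n represents the number of iterations, we have the
approximate differential equation

$$\frac{d\varphi}{dn} = \frac{2\theta}{\kappa(\varphi)} = 2\theta_0\,\kappa^{-2/3}(\varphi)$$

whence by integration

$$n = \frac{1}{2\theta_0}\int \kappa^{2/3}(\varphi)\,d\varphi.$$

It follows that $\varphi$ will increase by more than ... along the approximate invariant
curve if n exceeds ..., where ... denotes the maximum curvature.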

It thus appears as highly probable that the generalised theorem is applicable
... if the ball is projected from a point P ...




7. Existence of nearly periodic motions. 

Suppose now that we start with the infinite series of stable periodic motions,
which are numerable in the general non-integrable problem. Let these be

$$S_1, S_2, \ldots$$

Choose a stable periodic motion $S_k$ near to $S_1$ but distinct from it (which exists
in virtue of the generalized theorem as indicated above). Now choose an $S_l$
extraordinarily near to $S_k$ and distinct from $S_k$ also. In this way we can construct
a series

$$S_k, S_l, S_m, \ldots$$

approaching a limit motion uniformly, call it S, and yet itself necessarily distinct
from any periodic motion. Such a motion may be represented in the form

$$p = \lim_{j \to \infty} f_j(\tau), \qquad t = \lim_{j \to \infty} g_j(\tau)$$

where p is any coordinate of the motion, where $\tau$ denotes a periodic variable of
period $2\pi$ along $S_1$, say, and where

$$f_j(\tau) \quad\text{and}\quad g_j(\tau) - \frac{t_j}{2\pi l_j}\,\tau$$

are periodic functions of $\tau$ of period $2\pi l_j$. The integer $l_j$ denotes the number
of times the j-th motion of the sequence circulates about $S_1$ before reentering,
and $t_j$ denotes its period. The convergence is uniform in $\tau$.

It is clear from the manner of formation of these nearly periodic motions 
that they are non-denumerable, and constitute the class of uniform limits of 
periodic motions. 


8. The exceptional case. 

It has been stated that in order to apply the generalization of Poincare's 
geometric theorem to the neighborhood of a stable periodic motion, either an 
invariant c must not vanish, or at least one of an infinite set of similar con¬ 
stants must be different from zero. The exceptional case is that in which the 
period of the perturbed motion is independent of the constants of integration. 




The following example illustrates this category; there are only two stable periodic 
motions, and no other periodic motion whatsoever. In particular the example 
shows that the conjecture of Poincare concerning the dense distribution of the 
periodic motions is not correct. 

Imagine a particle of mass m to move in a plane subject to a force derived
from a potential energy:

$$\frac{m}{2}\left(k^2x^2 + l^2y^2\right)$$

where x and y are the rectangular coordinates of the particle. The differential
equations of motion are then

$$\frac{d^2x}{dt^2} + k^2x = 0, \qquad \frac{d^2y}{dt^2} + l^2y = 0.$$

The particle moves so that its projections on the x and y axes describe harmonic
motions about the origin of period $2\pi/k$ and $2\pi/l$ respectively.

If we fix the energy constant K, we consider those solutions for which the
relation

$$\frac{m}{2}\left[\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2 + k^2x^2 + l^2y^2\right] = K$$

holds. This equation shows that the motions all take place within an ellipse

$$k^2x^2 + l^2y^2 = \frac{2K}{m}.$$

Through each point of the ellipse there is one and only one motion in a given
direction, for this particular value of the energy constant. Thus we obtain a
dynamical problem somewhat analogous to the billiard ball problem, although
the velocity is now a known function of position and not a constant.
Furthermore the problem is, of course, integrable with general solution

$$x = A\cos kt + B\sin kt, \qquad y = C\cos lt + D\sin lt,$$

with the constants A, B, C, D subject to the condition

$$k^2(A^2 + B^2) + l^2(C^2 + D^2) = \frac{2K}{m}.$$

If we assume further that k and l are incommensurable with one another,
obviously there can be no periodic motions except the two corresponding to the
stable motions along the axes.

This problem can be reduced to a ring transformation, but such a reduction
is not necessary for our purposes. We merely note that the transformation T is
essentially a rigid rotation of the ring through an angle incommensurable with $2\pi$.
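
The properties of this example can be checked numerically. The sketch below is illustrative only: it assumes the potential energy $\tfrac{m}{2}(k^2x^2 + l^2y^2)$, takes k = 1 and l = $\sqrt{2}$ as one incommensurable pair, and picks arbitrary constants A, B, C, D. The function names `motion` and `energy` are hypothetical. It verifies that the energy is constant along the general solution and that the motion stays inside the ellipse $k^2x^2 + l^2y^2 = 2K/m$.

```python
import math

def motion(t, k, l, A, B, C, D):
    """Position and velocity of the general solution at time t."""
    x = A * math.cos(k * t) + B * math.sin(k * t)
    y = C * math.cos(l * t) + D * math.sin(l * t)
    vx = k * (-A * math.sin(k * t) + B * math.cos(k * t))
    vy = l * (-C * math.sin(l * t) + D * math.cos(l * t))
    return x, y, vx, vy

def energy(m, k, l, x, y, vx, vy):
    """Kinetic plus potential energy for the assumed potential (m/2)(k^2 x^2 + l^2 y^2)."""
    return 0.5 * m * (vx * vx + vy * vy + k * k * x * x + l * l * y * y)

m, k, l = 1.0, 1.0, math.sqrt(2.0)   # illustrative, incommensurable frequencies
A, B, C, D = 1.0, 0.0, 0.5, 0.0      # illustrative constants of integration
K = 0.5 * m * (k * k * (A * A + B * B) + l * l * (C * C + D * D))

samples = [motion(0.37 * i, k, l, A, B, C, D) for i in range(200)]
max_energy_drift = max(abs(energy(m, k, l, *s) - K) for s in samples)
max_ellipse_value = max(k * k * s[0] ** 2 + l * l * s[1] ** 2 for s in samples)
print(max_energy_drift, max_ellipse_value, 2.0 * K / m)
```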


9. Some further results. 

Associated with the unstable periodic motions are the two analytic families
of motions asymptotic to them. Poincaré pointed out that in the restricted
problem of three bodies and for sufficiently small values of the parameters
involved, these families intersect one another infinitely often, giving rise to
'homoclinic' motions, asymptotic to the given unstable periodic motion as time increases
and as it decreases. It is certain that this intersection of the two families (in
cases where they do not coincide identically) is a phenomenon of general occurrence,
although I have not as yet been able to treat certain exceptional cases.

I will prove here that every homoclinic motion is always in the immediate 
neighborhood of infinitely many periodic motions. 

The fact just stated implies incidentally that the unstable periodic motion
approached by the homoclinic motion lies in the immediate neighborhood of
infinitely many other periodic motions, which cannot of course be entirely within
a uniformly small neighborhood of this unstable motion.

In proving the result we shall confine attention to the case in which a
ring transformation is at hand, although this is not really essential to the
argument. In the figure the invariant point under T is represented by P, with the
asymptotic branches $\alpha$ and $\omega$ intersecting at Q by hypothesis. Now it is a
property of such an invariant point P that the behavior of nearby points under
iteration of T is essentially the same as if T were of the type


$$u_1 = \rho u, \qquad v_1 = \frac{v}{\rho} \qquad (0 < \rho < 1).$$

In particular there will exist a family of invariant curves like $uv = \text{constant}$,
not perhaps analytic but possessing a high degree of regularity. The curvilinear
pentagon PABCD represents a region near the invariant point analogous to a
region:

$$a > u > 0, \qquad a > v > 0, \qquad uv < c$$

with AB, CD analogous to $v = a$, $u = a$, with PA, PD analogous to $u = 0$, $v = 0$
and with BC analogous to $uv = c$. The dotted line in the figure represents
one of the invariant curves.
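
The behavior near such an invariant point can be illustrated with a small Python sketch. It assumes the model transformation $u_1 = \rho u$, $v_1 = v/\rho$ with $0 < \rho < 1$, and the starting point and the value of $\rho$ are arbitrary. The function name `hyperbolic_map` is hypothetical. The product uv is unchanged by each iteration, so every iterate lies on the same curve uv = constant, while the point moves in along one branch direction and out along the other.

```python
import math

def hyperbolic_map(u, v, rho):
    """Model map near an unstable invariant point: contract u, expand v."""
    return rho * u, v / rho

rho = 0.5                 # illustrative value with 0 < rho < 1
u, v = 0.8, 0.001         # illustrative starting point near the u axis
orbit = [(u, v)]
for _ in range(8):
    u, v = hyperbolic_map(u, v, rho)
    orbit.append((u, v))

products = [a * b for a, b in orbit]   # the value of uv along the orbit
print(orbit[-1], products[0], products[-1])
```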

It should be observed that as c approaches zero, the curve $uv = c$ approaches
... But the outermost segment of these curves $uv = c$ ... contains
a certain number of segments each ... by T, while a single ...
with a smaller constant c, there will be ...




length. This maximum length exceeds $(q - 1)l$ because a polygon may be chosen
made up of the longest chord taken $q - 1$ times together with one side of length
zero. The length cannot be as much as $ql$, since l is the length of the longest
chord. Nor can the maximizing polygon lie near the longest chord throughout
its length, or it would, of course, have an even number of sides. On the contrary,
one side of it must be at least as long as $(q - 1)l/q$ and so will be very near to the
longest chord if q is large. Consequently there are infinitely many harmonic
polygons lying in general very near to the longest chord, but leaving its imme¬ 
diate vicinity at least once. A refinement of this argument leads to the conclu¬ 
sion that there exist motions homoclinic to the fundamental unstable periodic 
motion also. It is not obvious that the maximum-minimum method will lead 
to like conclusions for the more complicated harmonic polygons corresponding 
to unstable periodic motions. 

The successful application of the above method for the derivation of addi¬ 
tional periodic motions of dynamical systems with two degrees of freedom re¬ 
quires the existence of motions homoclinic to unstable periodic motions. 

It will be noted that the requirement of an invariant area integral has not 
entered into the above reasoning, so that the criterion may be applied to diffe¬ 
rential systems, which are not associated with a dynamical problem. 


10. The totality of periodic motions. 

Thus if a dynamical system with two degrees admits of a single stable 
periodic motion of non exceptional type, it admits of infinitely many other stable 
periodic motions in its immediate vicinity. Consequently the totality of such 
stable periodic motions forms a set dense in itself, with nearly periodic limiting 
motions of the Bohr type. Each stable periodic motion has also infinitely many 
other unstable periodic motions in its vicinity, which in turn will be approached 
(but not uniformly) by infinitely many periodic motions, at least if certain homo¬ 
clinic motions exist. 

It still remains an open question as to whether or not the periodic motions
are densely distributed throughout the possible motions. This cannot be true
unconditionally, as the example given above makes clear. On the other hand I
have shown that the periodic motions together with those asymptotic to them
are everywhere dense in the transitive case. 1






Reprinted from Acta Litt. ac. Scientiarum, sect. Scientiarum mathematicarum,
Szeged, 15 August, 1928, Vol. 4, pp. 6-11.


A remark on the dynamical role of Poincare’s 
last geometric theorem.*) 

By G. D. Birkhoff (Cambridge, Mass.). 

Consider a dynamical system with one degree of freedom;
the equations of motion in the canonical form of Hamilton are:

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p}$$

where H = H(p, q) is a function of p and q. From the condition
that the energy H(p, q) is constant, the solution can be obtained
by a quadrature, and this case does not offer any especial interest.

For the case of two degrees of freedom assume the equations
of motion in the Hamiltonian form:

$$\frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \qquad \frac{dq_i}{dt} = \frac{\partial H}{\partial p_i} \qquad (i = 1, 2),$$

where $H = H(p_1, q_1, p_2, q_2)$. Consider $p_1, q_1, p_2, q_2$ as coordinates
in space of four dimensions. In the neighborhood of a periodic
solution the values $p_1, q_1, p_2, q_2$ corresponding to the different states
of motion correspond to a three-dimensional torus, in consequence
of the energy relation. As is well known, the problem can then
be reduced to the Hamiltonian case n = 1, namely

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p} \tag{1}$$

where H = H(p, q, t), t being an angular variable of period $2\pi$
which measures the distance of two points along the three-dimensional
torus. The given periodic motion can be made to correspond
to p = q = 0.

*) Lecture delivered at the meeting on June 8, 1928 of the Mathematical
Seminary of the University of Szeged.

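We may represent this system in the following form:

$$\frac{dp}{dt} = P, \qquad \frac{dq}{dt} = Q, \qquad \frac{dr}{dt} = 1 \tag{2}$$

where

$$P = -\frac{\partial H}{\partial q} \quad\text{and}\quad Q = \frac{\partial H}{\partial p}$$

and r replaces t in P and Q.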

Let us consider p, q, r as rectangular coordinates in space of
three dimensions.

The above equations give the direction of a stream line at
every point of the (p, q, r)-space. The motions of the dynamical
system are interpreted as the stream lines of a three-dimensional
fluid in steady motion.

Consider the planes r = 0 and $r = 2\pi$; two points of these
planes which have the same coordinates (p, q) are to be considered
as congruent; they correspond to the same state of motion in
consequence of the periodicity in t.

Take a point P of the plane r = 0 whose coordinates we
denote by (p, q), and follow the stream line starting from P up
to the point $P_1$ with coordinates $(p_1, q_1)$ in which it meets the
plane $r = 2\pi$. The correspondence between the points P and $P_1$
furnishes a transformation T of the (p, q)-plane into itself. For
this transformation the origin p = q = 0 is an invariant point.

The transformation T has two important properties. In the
first place the quantities P, Q, R (where R = 1 denotes the third velocity
component) satisfy the relation

$$\frac{\partial P}{\partial p} + \frac{\partial Q}{\partial q} + \frac{\partial R}{\partial r} = 0.$$

This means that the stream flow in space for which the velocity
components are P, Q, R, is that of an incompressible fluid.

Secondly if we consider a small closed curve in the plane
r = 0 and the cylinder of height h bounded by the stream lines
passing through the points of this curve and the planes r = 0
and r = h, it is clear that after an interval of time $2\pi$ this
cylinder goes into a like cylinder of height h with an equal area $\sigma_1$ of
the corresponding curve in the plane $r = 2\pi$ as base; this is a
consequence of incompressibility. In other words the transformation
T preserves areas. Consequently the Jacobian

$$\frac{\partial(p_1, q_1)}{\partial(p, q)}$$

is 1.

Hence to the dynamical problem corresponds a certain area¬ 
preserving transformation T of the (p,q )-plane into itself with 
invariant point at the origin. To the important properties of the 
dynamical system for motions near to the given periodic motion 
correspond properties of T. 
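
The area-preserving property can be checked numerically on a concrete case. The sketch below is only an illustration: the Hamiltonian $H = \tfrac{\omega}{2}(p^2 + q^2) + \tfrac{\varepsilon}{3}\cos t\; q^3$, the constants `OMEGA` and `EPS`, and the function names are all hypothetical choices, not taken from the paper. It integrates the canonical equations from t = 0 to t = $2\pi$ with a classical Runge-Kutta step to approximate T, and estimates the Jacobian determinant of T by central differences.

```python
import math

OMEGA = 0.7   # hypothetical linear frequency
EPS = 0.3     # hypothetical strength of the periodic perturbation

def rhs(p, q, t):
    """Canonical equations dp/dt = -dH/dq, dq/dt = dH/dp for the assumed H."""
    return (-OMEGA * q - EPS * math.cos(t) * q * q, OMEGA * p)

def period_map(p, q, steps=2000):
    """Approximate T: follow the motion from t = 0 to t = 2*pi (classical RK4)."""
    h = 2.0 * math.pi / steps
    t = 0.0
    for _ in range(steps):
        k1p, k1q = rhs(p, q, t)
        k2p, k2q = rhs(p + 0.5 * h * k1p, q + 0.5 * h * k1q, t + 0.5 * h)
        k3p, k3q = rhs(p + 0.5 * h * k2p, q + 0.5 * h * k2q, t + 0.5 * h)
        k4p, k4q = rhs(p + h * k3p, q + h * k3q, t + h)
        p += h * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
        q += h * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
        t += h
    return p, q

def jacobian_det(p, q, d=1e-5):
    """Central-difference estimate of the Jacobian determinant of T at (p, q)."""
    pa, qa = period_map(p + d, q)
    pb, qb = period_map(p - d, q)
    pc, qc = period_map(p, q + d)
    pe, qe = period_map(p, q - d)
    dp_dp, dq_dp = (pa - pb) / (2.0 * d), (qa - qb) / (2.0 * d)
    dp_dq, dq_dq = (pc - pe) / (2.0 * d), (qc - qe) / (2.0 * d)
    return dp_dp * dq_dq - dp_dq * dq_dp

det = jacobian_det(0.1, 0.2)
print(det)
```

Since the Runge-Kutta step is not itself exactly area-preserving, the estimate agrees with 1 only up to the integration and differencing errors; the origin, on the other hand, is mapped exactly to itself.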

If H is analytic in p, q, r then $p_1, q_1$ will also be analytic in
p, q. Likewise if H is continuous together with all of its partial
derivatives, the same will be true of $p_1, q_1$.

There arises now the interesting question as to whether or 
not there exists conversely a dynamical problem of this type for 
every such transformation T. The following result in this connec¬ 
tion will be proved: 

If

$$p_1 = \varphi(p, q), \qquad q_1 = \psi(p, q)$$

is an area-preserving transformation T such that $\varphi$, $\psi$ are continuous
together with all of their partial derivatives, while the origin
p = q = 0 is an invariant point, then there exists a corresponding
dynamical system (1) such that H is continuous together with all
of its partial derivatives in p, q, t and periodic of period $2\pi$ in t.

It would be of decided interest to establish a like result in 
the analytic case. 

In the neighborhood of the origin the transformation T from
(p, q) to $(p_1, q_1)$ is essentially an affine transformation of determinant 1.
Such a linear transformation can always be obtained by
a one-to-one analytic deformation (rotation or stretching) which
takes each point (p, q) into its transformed point $(p_1, q_1)$ while a
parameter r varies from 0 to $2\pi$. Upon this transformation may
be superimposed the very small displacement with components

(qi qO- 

The combined transformation depending upon the parameter r
leaves the origin invariant, yields a one-to-one transformation of
the neighborhood of the origin into itself, and takes (p, q) into
$(p_1, q_1)$ as r increases from 0 to $2\pi$.

As r varies from 0 to $2\pi$, the points (p, q, r) describe arcs
of curves joining (p, q, 0) to $(p_1, q_1, 2\pi)$ in such a way that r
increases, and the complete neighborhood of the r axis for $0 \le r \le 2\pi$
is filled by these in a one-to-one manner. If we set down all
congruent arcs of curves obtained by a translation of space by a
distance $2k\pi$ ($k = \pm 1, \pm 2, \ldots$) in the direction of the positive
r axis, all of (p, q, r) space in the neighborhood of the r axis is
filled by curves made up of such arcs whose equations will have
the form

$$p = f(r), \qquad q = g(r)$$

in which f and g are continuous together with all of their
derivatives except for $r = 0, \pm 2\pi, \ldots$ where there may exist finite
jumps in the derivatives.

Now imagine a deformation of the region $0 \le r \le 2\pi$ of
(p, q, r) space in the direction of the r axis, in accordance with
the formula

$$r = k\int \cdots$$

where the constant k is so selected that for $\rho = 2\pi$, r is $2\pi$ also.
Evidently r is thereby defined as a function of $\rho$, continuous
together with all of its derivatives for $0 \le \rho \le 2\pi$, while all of
these derivatives vanish for both $\rho = 0$ and $\rho = 2\pi$.
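
A deformation with the stated properties can be illustrated as follows. The integrand used here, $e^{-1/(s(2\pi - s))}$, is an assumption chosen because it is a standard function that is flat to all orders at both endpoints; it is not claimed to be the integrand of the formula above. The names `flat_weight` and `build_reparametrization` are hypothetical. The sketch tabulates r as a function of $\rho$ by the trapezoidal rule and normalizes by the constant k so that r = $2\pi$ at $\rho = 2\pi$.

```python
import math

TWO_PI = 2.0 * math.pi

def flat_weight(s):
    """Assumed integrand: positive inside (0, 2*pi), flat to all orders at both ends."""
    if s <= 0.0 or s >= TWO_PI:
        return 0.0
    return math.exp(-1.0 / (s * (TWO_PI - s)))

def build_reparametrization(n=4000):
    """Tabulate r(rho) = k * integral of flat_weight from 0 to rho (trapezoidal rule)."""
    h = TWO_PI / n
    grid = [i * h for i in range(n + 1)]
    w = [flat_weight(s) for s in grid]
    cumulative = [0.0]
    for i in range(n):
        cumulative.append(cumulative[-1] + 0.5 * h * (w[i] + w[i + 1]))
    k = TWO_PI / cumulative[-1]          # normalizing constant
    return grid, [k * c for c in cumulative], k

grid, r_values, k = build_reparametrization()
step = grid[1] - grid[0]
slope_at_start = (r_values[1] - r_values[0]) / step
slope_at_end = (r_values[-1] - r_values[-2]) / step
print(k, r_values[0], r_values[-1], slope_at_start, slope_at_end)
```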

When this deformation of (p, q, r) space is made for $0 \le r \le 2\pi$
together with the corresponding congruent deformations of the
regions

$$2k\pi \le r \le 2(k + 1)\pi \qquad (k = \pm 1, \pm 2, \ldots),$$

a modified set of curves is obtained in which the functions f, g
involved are everywhere continuous together with all of their
derivatives.

Now let each point P move along its curve in the direction
of the r axis with unit velocity. Any area $\sigma$ in the plane r = 0 is
thus carried into an area $\sigma_0$ in any plane $r = r_0$. But we have

$$\iint_{\sigma} dp\,dq = \iint_{\sigma_0} J\,dp\,dq$$

where J(p, q, r) denotes the corresponding Jacobian. Consequently
it is evident that the triple integral

$$\iiint J(p, q, r)\,dp\,dq\,dr$$

is invariant. Here J is not only continuous together with all of
its derivatives, but is periodic in r of period $2\pi$ since $J(p, q, 2\pi) = 1$
by hypothesis.

Suppose now that we deform (p, q, r) space in the direction
of the q axis so that

$$\bar q = \int_0^q J(p, q, r)\,dq.$$

This deformation is evidently periodic in the desired sense and
does not affect points in the planes $r = 2k\pi$ ($k = 0, \pm 1, \pm 2, \ldots$).
The new invariant integral is then simply the ordinary volume
integral $\iiint dp\,dq\,dr$ in the modified variables.

The corresponding differential equations are

$$\frac{dp}{dt} = P, \qquad \frac{dq}{dt} = Q, \qquad \frac{dr}{dt} = 1$$

where P, Q are continuous functions of p, q, r, together with their
partial derivatives of all orders, and periodic of period $2\pi$ in r.
Since volumes are invariant we have of course

$$\frac{\partial P}{\partial p} + \frac{\partial Q}{\partial q} = 0.$$

This means that a function H of the same type exists for which

$$P = -\frac{\partial H}{\partial q}, \qquad Q = \frac{\partial H}{\partial p}.$$

In other words the given area-preserving transformation may be 
associated with a dynamical problem of the stated type. 

This remark shows that the area-preserving property of the
transformation used by Poincaré in his last geometric theorem is
really its characteristic property. The remark shows also how the
dynamical problem leads to the consideration of a transformation
near an invariant point, or near a single closed invariant curve
into which such a point may be expanded, rather than to a
transformation defined over a complete ring as required by Poincaré.
It was for this reason that I developed a modification of
Poincaré's theorem for a transformation of this less restricted type,
which seems more appropriate to many of the actual dynamical
applications. Indeed the more detailed consideration of these
applications shows that for many purposes the use of Poincaré's
last geometric theorem in a modified self-evident form will suffice,
once the analytic details are developed. 1)

From a general topological point of view the plane translation
theorem of Brouwer may be looked upon as dealing with
the morphology of continuous one-to-one transformations of the
sphere with only a single invariant point. In the same way a
suitable extension of the theorem of Poincaré 2) throws light on
the morphology of any such transformation of the sphere with
only two such points. An important new method of attack devised
by Kerékjártó but not yet published 3) seems to afford a means
of treating these and other similar questions on a common
fundamental basis.


(Received June 12, 1928) 


1) Cf. chapter 6 of my book on Dynamical Systems, New York, 1927.
2) See my paper: An extension of Poincaré's last geometric theorem,
Acta Mathematica 47 (1925), p. 297.
3) See p. 86 of this volume (note of the editors).





Reprinted from Jour. de Math. Pures et Appliquées, s. 9, 1928,
Vol. 7, pp. 345-379.


Structure Analysis of Surface Transformations;

By George D. BIRKHOFF

(Cambridge, U. S. A.).

and Paul A. SMITH

(New York, U. S. A.).


Introduction. 

It is intended to set forth in this paper certain general facts concerning
the structure of one-to-one continuous transformations of surfaces into
themselves, especially as regards the movement of points under
indefinite iteration. For the most part, the transformations with which
we shall deal are non-analytic. The restriction of analyticity,
however, reduces the possible structural complexity to such an extent
that something like a systematic structure analysis in the general
analytic case can conceivably be developed. At the end of the paper
we shall indicate in the briefest way some of the possibilities with regard
to a systematic study of this sort, altho the paper as a whole may be
regarded as preliminary to such a study.

Surface transformations which are associated with certain types of
dynamical problems have the property that they admit an invariant
area integral. These « conservative » transformations, which have
been studied by Poincaré (II) (1) and more extensively by Birkhoff (IV),
possess a fundamental property of regional recurrence. In general,
however, the phenomenon of recurrence does not extend to the entire
surface, but does nevertheless take place within certain invariant
subsets. We shall undertake a study of these subsets: the precise
nature of the recurrence which takes place within them, the extent to
which they may be considered as transforming conservatively, the
general movement of the remaining points of the surface and questions
of uniform approach. Finally, we shall apply our general principles
in a brief examination of the structure of simple types of analytic
transformations.

(1) The roman numerals refer to the list of references found at the end of this
paper.

I. Preliminary definitions. The general analytic case. Throughout
this paper, the term « surface » (denoted consistently by S)
will mean a closed orientable surface of arbitrary genus and the term
« transformation » (denoted by T) will mean a one-to-one continuous
sense-preserving transformation of such a surface into itself.

We shall designate by $T_2, T_3, \ldots$ the successive powers of T, and
by $T_{-2}, T_{-3}, \ldots$ those of the inverse $T_{-1}$. An infinite sequence of
points of the form

$$\ldots, P_{-2}, P_{-1}, P, P_1, P_2, \ldots$$

where P is an arbitrary point of S, and $P_n = T_n(P)$, will be called a
complete sequence.

If two points $P_\alpha$ and $P_\beta$ ($\alpha < \beta$) of a complete sequence coincide,
then so do $P_{\beta - \alpha}$ and P. Let k be the smallest positive integer for
which $T_k(P) = P_k = P$. Then P will iterate periodically thru a
set of k distinct points, and is therefore called a periodic point of the
order k.
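
The definition of the order of a periodic point can be made concrete on a finite toy model. The sketch below is an illustration under stated assumptions, not an example from the paper: it replaces the surface by the finite grid of points (x, y) with integer coordinates modulo 5, and uses the one-to-one map (x, y) → (2x + y, x + y) modulo 5. The names `cat_map` and `order_of_point` are hypothetical. Because the map is one-to-one on a finite set, every point is periodic, and the order is found by iterating until the point returns.

```python
def cat_map(point, modulus):
    """A one-to-one map of the finite grid of integer points modulo `modulus`."""
    x, y = point
    return ((2 * x + y) % modulus, (x + y) % modulus)

def order_of_point(point, transform, max_iter=10 ** 6):
    """Smallest positive k with T_k(P) = P, together with the k distinct images."""
    images = [point]
    image = transform(point)
    while image != point:
        images.append(image)
        if len(images) > max_iter:
            raise RuntimeError("no return found within max_iter iterations")
        image = transform(image)
    return len(images), images

def step(p):
    """The toy transformation T on the grid modulo 5."""
    return cat_map(p, 5)

k_origin, _ = order_of_point((0, 0), step)      # the origin is an invariant point
k_sample, orbit = order_of_point((1, 0), step)
print(k_origin, k_sample, orbit)
```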

Any limit point of the infinite sequence $P, P_1, P_2, \ldots$ is called
an ω-limit point of P and any limit point of the sequence $P, P_{-1}, P_{-2}, \ldots$
is an α-limit point. In case P is periodic, each of its images
is to be considered an α- and an ω-limit point, and there are no others.

A complete sequence together with its α- and ω-limit points will be
called a complete group.

A set of points E is invariant under T if E and T(E) are identical
point sets. A complete group, for example, is a closed invariant set.
If a point P is contained in an invariant set, so is the complete sequence
of P. An invariant set, however, need contain no invariant or
periodic points. In case E is invariant under $T_k$, but not under T,
$T_2, \ldots, T_{k-1}$, it is periodic, and of the order k.

Suppose that a simply connected region G on S has the property
that it contains the transformed region T(G) together with the
boundary of T(G). Then G, together with its boundary is contained
in $T_{-1}(G)$. We shall call G a contracting region under T, or an
expanding region under $T_{-1}$.

We shall now summarize certain facts and definitions concerning
analytic surface transformations. In the neighborhood of an invariant
point O, T may be represented in terms of a properly chosen
coordinate system as follows:

$$u_1 = au + bv + \cdots, \qquad v_1 = cu + dv + \cdots \qquad (ad - bc > 0),$$

where the right hand members are convergent power series with
real coefficients, and $u_1$, $v_1$ are the coordinates of the transformed
point.

Let μ and ν be the roots of the characteristic equation

$$\lambda^2 - (a + d)\lambda + ad - bc = 0.$$

We shall say that O is of general type provided that μ and ν are
distinct and of modulus different from 1. In any other case O is of
special type. If μ and ν are real, and $0 < \mu < 1 < \nu$, O is called a
directly unstable point; if $\mu < -1 < \nu < 0$, O is inversely unstable.
All other invariant points of general type are called stable. An
invariant point which is inversely unstable under T, is directly
unstable under $T_2$, while stable and directly unstable points retain
their type under any power of T.
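
The classification just given depends only on the linear terms a, b, c, d, and can be written out as a short Python sketch. This is an illustration, not part of the paper: the function name `classify_invariant_point` and the numerical tolerance `tol` are assumptions, the tolerance being needed because equality of roots and modulus 1 cannot be tested exactly in floating point.

```python
import cmath

def classify_invariant_point(a, b, c, d, tol=1e-12):
    """Classify an invariant point from the linear terms of T (requires ad - bc > 0)."""
    det = a * d - b * c
    if det <= 0:
        raise ValueError("the linear part must satisfy ad - bc > 0")
    trace = a + d
    root = cmath.sqrt(trace * trace - 4.0 * det)
    mu, nu = (trace - root) / 2.0, (trace + root) / 2.0   # characteristic roots
    # Special type: equal roots, or a root of modulus 1.
    if abs(mu - nu) <= tol or abs(abs(mu) - 1.0) <= tol or abs(abs(nu) - 1.0) <= tol:
        return "special"
    if abs(mu.imag) <= tol and abs(nu.imag) <= tol:
        low, high = sorted((mu.real, nu.real))
        if 0.0 < low < 1.0 < high:
            return "directly unstable"
        if low < -1.0 < high < 0.0:
            return "inversely unstable"
    return "stable"

print(classify_invariant_point(2.0, 0.0, 0.0, 0.5))
print(classify_invariant_point(-2.0, 0.0, 0.0, -0.5))
print(classify_invariant_point(0.5, 0.0, 0.0, 0.25))
```

Squaring the linear part of the second example gives the diagonal terms 4 and 0.25, which the same function classifies as directly unstable, in agreement with the remark on $T_2$.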

The definitions above are independent of the coordinate system.
They extend, moreover, to periodic points of any order; for example,
if O is of order k, we take the roots μ and ν relative to $T_k$.

Suppose O is an invariant point of stable type. Then points in the
neighborhood of O converge toward O on indefinite iteration of T
(or of $T_{-1}$) in such a way that the region bounded by any sufficiently
small circle about O is contracting under T (or $T_{-1}$) (see reference to
Lattès).

Suppose next that O is of directly unstable type. There abut at O
four invariant curves or branches (1), two α-branches whose points
converge toward O on indefinite iteration of $T_{-1}$, and two ω-branches,
whose points converge toward O on iteration of T. A point
contained in a small neighborhood σ of O, but not on one of the invariant
branches, is carried out of σ on repeated iteration of either T or $T_{-1}$.

The two α-branches are analytic continuations of each other at O,
and taken together, form an invariant analytic curve without
singularities, which at O crosses the corresponding curve formed by the
two ω-branches. If two sufficiently small arcs $\alpha_1$ and $\alpha_2$ of the
two α-branches respectively, abut at O, they will not intersect. It
follows that the two α-branches can nowhere intersect each other;
for if P were a point of intersection, an image of P under a sufficiently
great power of $T_{-1}$ would be in both $\alpha_1$ and $\alpha_2$, which is impossible.
Moreover, an α-branch of O can not intersect an α-branch of some
other point, say Q, for a point of intersection would be carried
simultaneously toward O and Q on indefinite iteration of $T_{-1}$, which is
impossible. The same holds for the ω-branches.

An inversely unstable invariant point has α-branches of order 2:
they are invariant under $T_2$, but not T. An unstable periodic point O
of order k has α- and ω-branches of order k or 2k, according as O is
directly or inversely unstable. If O is of the former type, a point on
any of its branches tends asymptotically toward the complete group
of O on repeated iteration of T (or $T_{-1}$), and toward O itself, on
iteration of $T_k$ (or $T_{-k}$).

If we consider the totality of α- and ω-branches of all orders
on S, it is clear from the discussion above, that no α- (or ω-) branch
can intersect another α- (or ω-) branch. However, an α-branch
may intersect an ω-branch, and the points of intersection in such a
case are called doubly asymptotic (d. a.) points (Poincaré, II). If
the two branches which intersect at a d. a. point P actually cross, i. e.,
are not coincident or merely tangent at P, then P will be said to be of
general type: in the contrary case, P is of special type.

(1) First proved by Poincaré (I). With regard to these invariant branches, see also
Poincaré (II) and Lattès.



An analytic surface transformation which admits no d. a. or periodic
points of special type will be said to belong to the general analytic
case. The sense in which such transformations may be considered
« general » is indicated by the fact that a d. a. point of special type
can be converted into a number of d. a. points of general type by an
arbitrarily slight modification of T, but not conversely; the same
holds for the periodic points. We shall not, however, here consider
the situation in further detail.

A d. a. point is called homoclinic (Poincaré, II) if the α- and
ω-branches on which it lies issue from the same unstable periodic point
or from two points belonging to one and the same periodic group.
For convenience, let us say that homoclinic points of the former type
are simple.

We shall have occasion to refer later to the following theorem : 

In the general analytic case, an arbitrarily small neighborhood of a
homoclinic point contains infinitely many periodic points.

A proof of this theorem is given by Birkhoff (V) for the case of
simple homoclinic points (1). The remaining cases are disposed of
by the following lemma:

In the general analytic case, an arbitrarily small neighborhood of a
homoclinic point contains a homoclinic point of simple type.

We shall briefly indicate the proof. Suppose for concreteness
that P is a point of intersection of an α-branch of O with an
ω-branch of O_1 = T(O), where O is a directly unstable point of order 2.
Thus P is a non-simple homoclinic point. The transform of
the α-branch OP is the α-branch O_1P_1, and the transform of the
ω-branch O_1P is the ω-branch OP_1. (See fig. 1).

Now O is invariant under T_2, and P_1, being on an ω-branch of O,
is carried toward O on repeated iteration of T_2. Moreover, a point
sufficiently close to O and on the proper side of the curve OP_1 is
carried toward and beyond P on iteration of T_2, the successive images


(1) The proof given assumes the existence of an invariant integral, but can be
extended to the general case; details for this will appear elsewhere.





GEORGE D. BIRKHOFF AND PAUL A. SMITH. 



remaining close to OP. Hence a small arc β crossing OP_1 at P_1
will eventually be carried into an arc β* which follows along close
to OP sufficiently far that it will cross PO_1 near P, say at M. Since


Fig. 1.



the d. a. point P_1 is of general type, the α-branch O_1P_1 actually
crosses OP_1 at P_1 and hence β may be taken as an arc of O_1P_1.
Hence β* is also an arc of the α-branch O_1P_1 and M is therefore a
homoclinic point of simple type. This establishes the lemma for the
case considered, and there is no difficulty in making the proof general.

2. The central motions (1).

Consider an arbitrary connected region σ on S. It may happen
that σ is intersected by none of its images

..., σ_{-1}, σ_1, ...,

in which case σ is called a wandering region and its points wandering
points. A point of S which is contained in no wandering region is a
non-wandering point.

No two images of a wandering region can intersect. For if σ_i
and σ_j (i < j) intersect, then so do σ and σ_{j-i}, which is impossible.
Consequently any image of a wandering region or point is again
wandering, and any image of a non-wandering point is non-wandering.
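The definition of a wandering region can be tried out numerically on a toy example. The sketch below is only an illustration under assumptions that are not in the text: it replaces the surface S by the interval [0, 1] and T by the hypothetical map x → x², and it checks only finitely many powers of T.

```python
import math

def T(x):
    """Toy homeomorphism of [0, 1]; a stand-in for the surface map T (assumption)."""
    return x * x

def T_inv(x):
    """Inverse map, playing the role of T_{-1}."""
    return math.sqrt(x)

def image(interval, n):
    """Image of an interval under the n-th power of T (n may be negative)."""
    a, b = interval
    step = T if n > 0 else T_inv
    for _ in range(abs(n)):
        # both maps are increasing, so endpoints map to endpoints
        a, b = step(a), step(b)
    return (a, b)

def overlaps(u, v):
    """True when two open intervals share at least one point."""
    return max(u[0], v[0]) < min(u[1], v[1])

sigma = (0.3, 0.5)
images = {n: image(sigma, n) for n in range(-6, 7)}

# sigma meets none of its images among the powers checked
is_wandering = not any(overlaps(sigma, images[n]) for n in images if n != 0)

# no two of the computed images meet each other
pairwise_disjoint = all(
    not overlaps(images[i], images[j])
    for i in images for j in images if i < j
)

# a larger interval that meets its first image, hence is not a wandering region
big = (0.3, 0.6)
big_meets_image = overlaps(big, image(big, 1))
```

In this toy case the only non-wandering points are the fixed points 0 and 1, toward which every other point tends under iteration of T or T_{-1}.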

(1) Cf. Birkhoff, IV.



Among the non-wandering points are the α- and ω-limit points.
Suppose for example that L is an ω-limit point and let σ be an
arbitrarily small neighborhood of L. Within σ there are infinitely
many points of some sequence P, P_1, P_2, ... If P_α and P_β (α < β)
are two of these points in σ, the regions σ_{β-α} and σ obviously overlap,
and hence σ can not be wandering.

The totality of non-wandering points of S constitutes a non-null
closed invariant set M^1 towards which all other points tend asymptotically
on indefinite iteration of T or T_{-1}.

In the first place, non-wandering points must exist on S, for there
are always α- and ω-limit points on a closed surface. The set
S − M^1 consists only of inner points, and hence M^1 is closed.
Since all images of a non-wandering point are non-wandering, M^1 is
invariant. Finally, if P is a point of S − M^1, the sequence P, P_1, ...
tends asymptotically toward M^1. For otherwise a number δ > 0
exists and an infinite subsequence of points P_i, P_j, ... each of which
is at a distance greater than δ from M^1. No limit point of the
subsequence can belong to M^1; but on the other hand, every such
limit point is an ω-limit point and therefore non-wandering. This
contradiction proves our assertion. It follows similarly that the
sequence P, P_{-1}, ... tends asymptotically toward M^1.

The following theorem concerns the movement of W^1 = S − M^1 as
a whole, on indefinite iteration.

Theorem 1. — Not more than k points of a complete sequence of
wandering points can be outside a given neighborhood V of M^1, where k
depends only on the choice of V.

Proof. Suppose the theorem false. Then there exist complete
sequences which have more than N points in W^1 − V, where N is
arbitrary. Thus, for every positive integer n, there is a set E^n
consisting of at least n points taken from a complete sequence, and all
contained in W^1 − V. We shall pick from each E^n a pair of points P^n
and Q^n chosen such that the distance P^n Q^n shall converge to zero
with 1/n. This is possible since the number of points in E^n grows
indefinitely with n.



Now the set W^1 − V is closed and any limit point L of the
sequence P^1, P^2, ..., must therefore be wandering. But on the other
hand a neighborhood σ of L, however small, will contain a pair P^n, Q^n;
and since Q^n belongs to the same sequence as P^n, a certain power of T
or T_{-1} will carry P^n into Q^n, and hence σ into a region that overlaps σ.
Therefore L is non-wandering, which is a contradiction.

It may of course happen that the set M^1 is identical with S. This is
the case, for example, when T possesses an invariant integral of a
certain type, as we shall see later.

Let us suppose now that M^1 is not identical with S, and let us take
the set M^1 as fundamental instead of S. A connected region σ which
contains points of M^1 will be called wandering with respect to M^1 if the
set σM^1 of points common to σ and M^1 is intersected by none of its
images under powers of T or T_{-1}. The points of M^1 which are
contained in such a region are called wandering with respect to M^1, and
their totality will be denoted by W^2. The set M^2 = M^1 − W^2 consists
of the points which are non-wandering with respect to M^1. In
case M^2 = M^1, we shall say that M^1 is non-wandering with respect
to itself.

The complete analogy which exists between S, M^1, W^1, and M^1,
M^2, W^2, will be seen immediately; M^2 is a non-null invariant closed
subset of M^1, and toward M^2 the points of W^2 tend asymptotically on
indefinite iteration of T or T_{-1}.

In case M^2 is not identical with M^1, the process may be carried one
step farther, yielding the set M^3 of points which are non-wandering
with respect to M^2. We continue thus until we arrive at the set M^r
which is non-wandering with respect to itself. In case, however,
that no such set appears after a finite number of steps, we shall have
an infinite sequence M^1, M^2, ... with M^1 > M^2 > ... The
set M^ω = M^1 M^2 ..., is closed and not null, and our process applied
to M^ω yields M^{ω+1}, then M^{ω+2} and so on.

In this manner we obtain an ordered aggregate of point sets

M^1, M^2, ..., M^ω, M^{ω+1}, ....

Each set is a proper subset of all those preceding it. Such an
aggregate can be at most denumerable, and hence, when arranged as





above in a well-ordered sequence, is associated with a definite ordinal r
of Cantor's second ordinal class. Thus the sequence above terminates
with M^r which therefore must be non-wandering with respect to
itself. The points of M^r will be called central points, and a complete
sequence of central points will be called a central motion.
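The construction just described can be condensed into a transfinite recursion. The display below is only a restatement under assumed notation: the symbol Ω(T | A), for the set of points of a closed invariant set A that are non-wandering with respect to A, and the convention M^0 = S are introduced here for illustration and do not appear in the text.

```latex
\begin{aligned}
M^{0} &= S, \\
M^{\nu+1} &= \Omega\bigl(T \mid M^{\nu}\bigr), \\
M^{\lambda} &= \bigcap_{\nu<\lambda} M^{\nu}
  \qquad (\lambda \text{ a limit ordinal, e.g. } \lambda=\omega), \\
M^{r} &= \Omega\bigl(T \mid M^{r}\bigr)
  \qquad (\text{the first ordinal } r \text{ at which the process stops}).
\end{aligned}
```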

Among the closed invariant point sets which are non-wandering
with respect to themselves, the set M^r is maximal in the sense that
every such set is contained in M^r. For if E is such a set, we have
successively E < M^1, E < M^2, ..., E < M^ω, ..., and hence E < M^r.
Moreover any closed invariant set on S may be taken as the initial set
in the above process, and hence must contain a subset which is
non-wandering with respect to itself. It follows that every closed invariant
set contains at least one central motion. This applies in particular to a
complete group.

A study of the structure of M^r and the sequence M^1, M^2, ..., which
determines M^r will occupy much of our attention in what follows.
We shall first prove a fundamental recurrence property of M^r.

A point which is both an α- and ω-limit point of its own
complete sequence will be called pseudo-recurrent (1). The characteristic
property of such a point is that it returns infinitely often into an
arbitrarily small neighborhood of itself under indefinite iteration of T
as well as T_{-1}. All images of a pseudo-recurrent point are
pseudo-recurrent. Moreover, a pseudo-recurrent point is a central point, for
its complete group is obviously non-wandering with respect to itself
and is therefore contained in M^r.

The fundamental recurrence property of M^r may now be stated as
follows:

Theorem 2. — The set E which consists of the pseudo-recurrent points
together with the limit points of pseudo-recurrent points is identical
with M^r.

Proof. First, since pseudo-recurrent points are central points
and since M^r is closed, we have E < M^r.


(1) The term « recurrent » has been used elsewhere (Birkhoff, III) with a slightly
different meaning. Recurrent points are pseudo-recurrent, but not conversely.
Journ. de Math., tome VII. — Fasc. IV, 1928.



It remains to prove that M^r < E. We shall show that an arbitrarily
small neighborhood σ of a central point A contains at least one
pseudo-recurrent point, so that A is either pseudo-recurrent, or a limit point
of pseudo-recurrent points, and is therefore in E.

The set of central points contained in any small region, for
example σ, must intersect images of itself under powers of T and T_{-1},
for there are no wandering regions with respect to M^r. Hence there
exist in σ a pair of central points P and Q which are images, one of
the other, under some power of T. We shall assume that P
precedes Q (1).

Next choose about P a neighborhood p so small that both p and q,
the corresponding neighborhood of Q, shall be contained in σ. There
exists in p a pair P^1 and Q^1 of central points, images one of the other,
under some power of T. We shall suppose this time that P^1 is
preceded by Q^1.

We shall describe one more step in detail. A neighborhood p^1 of P^1
is chosen so small that both p^1 and q^1 shall be contained in p. In p^1
are P^2 and Q^2, images one of the other under some power of T, and
named so that P^2 precedes Q^2. The important point in the choice of
successive pairs P^i and Q^i is to name them in such a way that P^{2n}
precedes Q^{2n}, while P^{2n+1} is preceded by Q^{2n+1}.

In continuing thus, we choose the successive neighborhoods p, p^1, ...,
in such a way that the diameter of p^i shall converge to zero as i → ∞.
By the manner in which these neighborhoods are defined, we have

(1)   σ > p > p^1 > p^2 > ...,

(2)   σ > q,   p > q^1,   ...,   p^α > q^{α+1},   ...

Now there must exist at least one point L with the property that it
lies in or on the boundary of each neighborhood of the sequence (1).
We shall show that L is pseudo-recurrent, and thus establish our
theorem.

We must show, then, that given an arbitrarily small neighborhood λ


(1) I. e., when the complete sequence to which P and Q belong is written according
to increasing powers of T. In case P coincides with Q, we shall say that P
precedes and is preceded by Q.



of L, there are in λ images of L under powers of T as well as T_{-1}.
Let α be a positive integer chosen so large that p^α together with its
boundary shall be contained in λ. Then by (1) and (2), the
regions p^{α+1}, q^{α+1}, p^{α+2}, q^{α+2}, are all contained in p^α and hence in λ.

Suppose that α is even. Then q^{α+1} precedes p^{α+1} and p^{α+2}
precedes q^{α+2}. Therefore, since L is in or on the boundary of
both p^{α+1} and p^{α+2}, the image of L under some power of T_{-1} is in or
on the boundary of q^{α+1}, hence in λ; and the image of L under some
power of T is in or on the boundary of q^{α+2}, hence in λ. The situation
is reversed if α is odd. This completes the proof.

We shall now prove a theorem concerning the distribution of
pseudo-recurrent points in the following important special case: T is
analytic and possesses an invariant integral defined over a closed
invariant set E, where E is measurable in the sense of Lebesgue and of
non-zero measure. We assume specifically that the integral is of the
form

∫_e φ(P) dσ,   0 < α < φ(P) < β on E   (β finite),

the function φ being defined and measurable on E.

The set of pseudo-recurrent points contained in E is measurable, and
its measure is equal to m(E).

Proof. Suppose e is a measurable subset of E, with m(e) > 0.
Then e must intersect images of itself under powers of T and T_{-1}.
For if the sets e, e_1, e_2, ..., were mutually exclusive, the sum

Σ_i ∫_{e_i} φ(P) dσ

could not be finite, since ∫_e = ∫_{e_1} = ∫_{e_2} = .... But in contradiction to
this we have

Σ_i ∫_{e_i} φ(P) dσ ≤ ∫_E φ(P) dσ ≤ β m(E).


Now let ε_1, ε_2, ..., be a sequence of positive numbers converging
to zero and let H^i consist of those points of E which never come within


a distance ε_i from their initial positions on iteration of T. We shall
show that m(H^i) = 0 for every i. For suppose that the outer measure
of H^1, say, is > 0. Then H^1 possesses a measurable subset H of
non-zero measure (1).

Now, we can obviously choose a simply connected region σ of
diameter smaller than ε_1 for which m(σH) > 0. By the remark
above, σH must intersect an image of itself under some power of T,
and this same power of T therefore carries some point of σH back into σ.
This is impossible by definition of H^1. Hence m(H) = m(H^1) = 0.

It follows that m(K) = m(E), where K = E − (H^1 + H^2 + ...).

Each point of K is clearly an ω-limit point of its own complete
sequence. By entirely similar reasoning we arrive at a set L
with m(L) = m(E), each of whose points is an α-limit point of its
own complete sequence.

Since the sets K and L are both contained in E, and are in measure
equal to m(E), they must overlap to the extent that m(KL) = m(E).
The points of KL are of course pseudo-recurrent, which establishes
the theorem.

For a conservative transformation, E is identical with S, and therefore
the measure of the pseudo-recurrent points equals the total surface
area of S. This is closely related to the statement of Poincaré (II)
that in certain dynamical problems, there exists stability in the sense
of Poisson, except for « motions of zero probability ».
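A finite analogue may help fix the idea that a measure-preserving map brings points back. The following sketch is an illustration only and rests on assumptions outside the text: it uses the map (x, y) → (2x + y, x + y) on an n × n lattice taken modulo n, a hypothetical stand-in for a conservative T. On a finite lattice this map is a permutation, so every lattice point returns exactly to itself, which is much stronger than pseudo-recurrence.

```python
def cat_map(p, n):
    """One step of the lattice map (x, y) -> (2x + y, x + y) mod n (determinant 1)."""
    x, y = p
    return ((2 * x + y) % n, (x + y) % n)

def return_time(p, n):
    """Smallest k >= 1 with T_k(p) = p, or None if not found within n*n steps."""
    q = p
    for k in range(1, n * n + 1):
        q = cat_map(q, n)
        if q == p:
            return k
    return None

n = 12
grid = [(x, y) for x in range(n) for y in range(n)]

# the map permutes the lattice: distinct points have distinct images
is_bijection = len({cat_map(p, n) for p in grid}) == len(grid)

# every lattice point comes back to its initial position
times = {p: return_time(p, n) for p in grid}
all_return = all(t is not None for t in times.values())
```

The counting step (no set of positive measure can have infinitely many mutually exclusive images) corresponds here to the fact that a permutation of finitely many points cannot move a point through infinitely many distinct positions.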

Theorem. At least one point of every set of k successive points of a
complete sequence falls in a given neighborhood V of the set M^r. The
value of k depends only on V.

Proof. At least one point of every complete group must fall in V,
since a complete group contains at least one central motion. Now if
the theorem were false, there would be sequences of the form P,
P_1, ..., P_N, N being arbitrarily large, which have no points in V.
Let

P, P_1, ..., P_{m_1};   Q, Q_1, ..., Q_{m_2};   ...

(1) See Carathéodory, Vorlesungen über reelle Funktionen.



be an infinite succession of such sequences, with m_1 < m_2 < .... Then
clearly the complete group of any limit point of the set P, Q, ...
must be entirely outside of V, which is impossible.

3. The sequence N^1, N^2, .... — We shall now define the set M^r of
central motions by means of a new fundamental sequence, and in so
doing, we shall reveal certain additional properties of M^r.

We shall employ a modification of our earlier process, which
consists in considering those non-wandering points which are α- or
ω-limit points. If to the set of α- and ω-limit points of S, we add the
ordinary limit points of such points, we obtain a closed invariant
set N^1. This set is of course contained in M^1, altho the two sets may
be identical. (We shall consider later the conditions under which
this must happen.)

By the same argument which we used for M^1, it follows that the
points of S − N^1 tend asymptotically toward N^1 on indefinite iteration
of T or T_{-1}; we do not, however, have a theorem analogous to
Theorem 1, § 2 for this case.

Let us now take the set N^1 as fundamental. An α-limit point with
respect to N^1 is a limit point of some sequence P, P_{-1}, ..., contained
in N^1. If to the α- and ω-limit points with respect to N^1, we add
their limit points (in the ordinary sense) we obtain a closed invariant
set N^2 contained in N^1; it is easily verified that N^2 is contained also
in M^2.

In the light of the preceding section, the manner of procedure is
clear. We arrive eventually at a closed invariant set N^s, with
which the process terminates, i. e. such that N^{s+1} = N^s.

Theorem. — The sets N^s and M^r are identical.

Proof. Since N^s consists of α- and ω-limit points with respect to
itself, and the ordinary limit points of such points, an arbitrarily
small region σ which contains points of N^s contains at least one limit
point of some complete sequence contained in N^s. Hence σN^s must
intersect images of itself under powers of T or T_{-1}. Thus N^s is
non-wandering with respect to itself and therefore N^s < M^r (§ 2).

Next, it is clear from their definition that all pseudo-recurrent
points belong to N^s. Since N^s is closed, limit points of pseudo-recurrent



points are also contained in N^s. Therefore, by Theorem 2, § 2,
M^r < N^s. Hence M^r = N^s.

Between the ordinals r and s we have the relation s ≤ r; it is
probable that s may be actually less than r in certain cases.

We shall now prove a simple lemma preliminary to obtaining a
further property of M^r:

Lemma. — If Q is a limit point of a complete sequence Σ relative
to T, it is also a limit point of a complete sequence Σ' relative to T_k,
k being any integer, and Σ' being a subsequence of Σ.

Let Σ be the sequence ..., P_{-2}, P_{-1}, P, P_1, ... and from it let us
extract a subsequence P_α, P_β, ..., which converges to Q. The
subsequences

..., P_{-k+j}, P_j, P_{k+j}, P_{2k+j}, ...   (j = 0, 1, ..., k − 1)

constitute a set of k complete sequences of T_k, and each is a subsequence
of Σ. Taken together, these sequences contain all the points of Σ,
and hence at least one of them contains infinitely many points of
the sequence P_α, P_β, ... and so has Q for a limit point.
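The pigeonhole step of the lemma can be illustrated numerically. The sketch below is a finite illustration under assumptions that are not in the text: the sequence P_n is taken to be n·θ modulo 1 for a hypothetical irrational θ, and Q, the tolerance, k, and the number of terms are arbitrary choices. A finite computation cannot exhibit infinitely many points; it only shows that one residue class modulo k receives at least its share of the points near Q.

```python
import math
from collections import defaultdict

theta = (math.sqrt(5) - 1) / 2   # hypothetical rotation number
Q, tol = 0.25, 0.01              # target point and closeness tolerance
k, N = 3, 5000                   # power T_k and number of terms examined

def P(n):
    """n-th point of the toy sequence: n * theta reduced modulo 1."""
    return (n * theta) % 1.0

# indices of the points of the sequence that lie close to Q
hits = [n for n in range(N) if abs(P(n) - Q) < tol]

# split those indices among the k sequences of T_k (residue classes mod k)
classes = defaultdict(list)
for n in hits:
    classes[n % k].append(n)

# the class holding the most points near Q
j_best = max(classes, key=lambda j: len(classes[j]))
best = classes[j_best]
```

Since the k classes together contain all the hits, the largest one contains at least a k-th of them, which is the finite counterpart of "at least one of them contains infinitely many points".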

Theorem 3. — The sets N^1, N^2, ..., relative to T are identical respectively
with the sets N̄^1, N̄^2, ..., relative to T_k. Hence the set of central
points relative to T is identical to the set relative to T_k.

Proof. If Q is an α- or ω-limit point of T_k, it is of course an α- or
ω-limit point of T. By virtue of the lemma, the converse is also
true. Hence N^1 = N̄^1. Next, if Q is an α- or ω-limit point of T_k with
respect to N̄^1, it is an α- or ω-limit point of T with respect to N^1.
Again, the converse of this statement follows from the lemma, and we
have N^2 = N̄^2. For suppose Q to be a limit point of a complete
sequence Σ of T, contained in N^1. Then by the lemma Q is also a
limit point of a complete sequence Σ' of T_k, where Σ' is contained in Σ
and hence in N^1. Thus Q is an α- or ω-limit point of T_k with respect
to N̄^1, as stated. Proceeding in an entirely similar manner for N^3,
N^4, ..., the proof of our theorem is established.



It is questionable whether or not the theorem holds also for the sets

M^1, M^2, ....

Before considering under what circumstances the sets M^1 and N^1 are
identical, we shall introduce a new definition. Let σ be a connected
region contained in the open set S − N^1. As we have seen, the points
of σ tend asymptotically toward N^1 on indefinite iteration. Now it
may happen that their totality tends toward N^1 uniformly on iteration
of T (or T_{-1}). By this we mean that given arbitrary positive ε, there
exists a positive integer K such that each point of each σ_k (or σ_{-k}),
k > K, is within a distance ε from the closed set N^1. In such a case
we shall call σ an ω- (or α-) regular region, and its points ω- (or α-)
regular points. Points of S − N^1 which are contained in no such
region will be called ω- (or α-) irregular.


Theorem. — Points of M^1 − N^1, if any exist, are α- and ω-irregular.

Proof. Let P be such a point. An arbitrarily small neighborhood
σ of P is intersected by at least one of its images under powers of T.
Consequently σ contains a point Q, an image of which, say Q_k, k > 0,
is also in σ. Let us choose a point pair such as Q, Q_k for each one of
a sequence of regions σ^1, σ^2, ..., closing down on P. Let these pairs be

(Σ)   Q^1, Q^1_{m_1};   Q^2, Q^2_{m_2};   ....

There can not be a finite upper bound for the sequence of positive
integers m_1, m_2, .... For if N were such a bound, infinitely many
of the integers of the sequence are equal to some integer m, 0 < m ≤ N,
and from Σ we could extract a sequence of the form

Q^a, Q^a_m;   Q^b, Q^b_m;   ....

But since the sequences Q^a, Q^b, ..., and Q^a_m, Q^b_m, ..., both
converge to P, it is clear that P_m coincides with P. Thus P is
periodic, and hence belongs to N^1, contrary to hypothesis.

It follows that there can be extracted from Σ a sequence of the form

Q^a, Q^a_{k_1};   Q^b, Q^b_{k_2};   ...

with 0 < k_1 < k_2 < .... Each region σ^i contains all the points of
this sequence from a certain rank on, and hence it is clear that no σ^i
could possibly tend uniformly toward N^1 on iteration of T. Hence P



is ω-irregular. By an entirely similar argument, P is shown to be
α-irregular. This completes the proof.

We shall have occasion later to refer to the following theorem.

Theorem 4. — If a connected region ρ contains non-wandering points,
it is intersected by infinitely many of its images under powers of T as
well as T_{-1}.

Proof. Let P be a non-wandering point in ρ. Referring to the
proof of the preceding theorem, let Q^1, Q^1_{m_1}; Q^2, Q^2_{m_2}; ... be a
sequence of the type (Σ) and converging to P. As we have shown,
either there exists no finite upper bound for the sequence m_1, m_2, ...,
or else P is periodic. Either situation leads to the stated conclusion.

4. Invariant integrals. — We have seen that an invariant set E over
which there can be defined an invariant integral of a certain type
must necessarily consist of central motions. It is probably not true,
however, that conversely, an invariant integral may always be defined
over an invariant subset of M^r. Suppose, for example, that M^r is
identical with S. Then if σ is any connected region on S, the
regions σ, σ_1, σ_2, ..., can not be mutually exclusive. But there is no
apparent reason why the regions of some infinite subsequence
σ_{n_1}, σ_{n_2}, ..., should not be mutually exclusive, a situation which could
not arise if T were conservative. Indeed, there can not be any purely
topological condition for a metrical phenomenon such as conservatism.
It will be worth while, however, to examine any available
condition which will shed light on the structure of T.

Consider the region σ. In general there will be some image of σ
whose area is smaller than that of σ. Hence on dividing S into a
number of regions and choosing the proper image of each, S becomes
compressed, in a sense, into an area smaller than its total surface area.
We shall show that a necessary and sufficient condition that there
exist invariant integrals of a certain type on S, or part of S, is that S
be not compressible into an arbitrarily small area. This is an intuitive
statement of the results of this section.

We shall assume now that T is analytic, and shall begin by
introducing a function φ(e), e being any measurable set on S,



defined as follows: let e be divided into a finite number of mutually
exclusive measurable sets δ^i,

e = Σ_i δ^i,   δ^i δ^j = 0 for i ≠ j.

Then φ(e) is the lower bound of the sum

Σ_i m(δ^i_{n_i})

with respect to all possible methods of subdivision of e into finite
numbers of measurable sets, and all possible choices of the integers n_i.
Here we make use of the fact that the property of measurability is
preserved under analytic transformations.

It is clear that the function φ may be identically zero, in which case
S is « compressible into an arbitrarily small area ». This happens, for
example, when T is an analytic transformation of a sphere such that
each circle parallel to the equator closes down on the north (or south)
pole on indefinite iteration of T (or T_{-1}). On the other hand, for a
transformation which preserves areas, we have φ(e) = m(e).
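The two extreme cases can be contrasted in a one-dimensional toy computation. This is a sketch under assumptions that are not in the text: lengths of intervals replace areas, the hypothetical maps x → x² on [0, 1] and x → x + 0.1 on the line replace T, and only the trivial subdivision (the set itself as a single piece) with finitely many powers is examined, so the values obtained are upper estimates of the lower bound rather than the lower bound itself.

```python
def image_length(interval, n, step):
    """Length of the image of an interval under the n-th power of an increasing map."""
    a, b = interval
    for _ in range(n):
        a, b = step(a), step(b)
    return b - a

def squeeze(x):
    """Hypothetical contracting map on [0, 1]: images close down on 0."""
    return x * x

def shift(x):
    """Hypothetical length-preserving map: translation of the line."""
    return x + 0.1

e = (0.2, 0.8)
powers = range(0, 11)

# smallest image length found among the powers checked
bound_squeeze = min(image_length(e, n, squeeze) for n in powers)
bound_shift = min(image_length(e, n, shift) for n in powers)
```

For the contracting map the estimate becomes as small as one pleases when more powers are allowed, in line with the identically zero case; for the translation every image has the length of e, in line with the measure-preserving case.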

In any case, it follows immediately from the definition, that

φ(e) ≤ m(e)

and hence φ(e) is bounded, and totally continuous on S.
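The inequality can be read off from the trivial subdivision. In the display below the pieces of a subdivision are written δ^i and the chosen powers n_i; this notation is assumed for illustration, and the power n = 0 is taken to mean the identity transformation.

```latex
\varphi(e)
  \;=\; \inf_{\{\delta^{i}\},\,\{n_i\}} \; \sum_i m\bigl(\delta^{i}_{n_i}\bigr)
  \;\le\; m\bigl(\delta^{1}_{0}\bigr)
  \;=\; m(e),
\qquad \text{taking } \delta^{1} = e,\; n_1 = 0 .
```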

The importance of φ for our purposes is due to the following
theorem:

φ(e) is a completely additive function of measurable sets and is
invariant under T.

Proof. We first prove the invariance of φ. Let e be an arbitrary
measurable set and suppose that φ(e) < φ(e_1). By a proper subdivision
of e into a finite number of measurable sets, together with a
proper choice of the corresponding integers n_i, we obtain a sum

Σ_i m(δ^i_{n_i})


which approximates φ(e) to any desired degree of closeness. In
particular, we may assume by virtue of the assumption φ(e) < φ(e_1)
that

Σ_i m(δ^i_{n_i}) < φ(e_1).

Now Σ_i m((δ^i_1)_{n_i − 1}) is an approximating sum for φ(e_1). Since its value
is Σ_i m(δ^i_{n_i}), the inequality above contradicts the fact that φ(e_1) is the
lower bound of its approximating sums. Hence φ(e) can not be
smaller than φ(e_1). By interchanging e and e_1, the same argument
shows that φ(e) can not be greater than φ(e_1). Hence φ(e) = φ(e_1).

Next we wish to prove that if e and f are measurable sets without
common points,

φ(e + f) = φ(e) + φ(f).

Let us choose approximating sums

Σ m(δ^i_{n_i}),   Σ m(θ^i_{n_i}),   Σ m(γ^i_{n_i})

for e, f, and e + f respectively. Regardless of how the first two
sums are chosen, we can always choose the third such that

Σ m(γ^i_{n_i}) = Σ m(δ^i_{n_i}) + Σ m(θ^i_{n_i});

hence it follows that

φ(e + f) ≤ φ(e) + φ(f).

Now suppose that

φ(e + f) < φ(e) + φ(f).

Assuming, as we may, that Σ m(γ^i_{n_i}) approximates φ(e + f) sufficiently
closely, it follows that

(1)   Σ m(γ^i_{n_i}) < φ(e) + φ(f).

New approximating sums for φ(e) and φ(f) are furnished by

Σ m((eγ^i)_{n_i})   and   Σ m((fγ^i)_{n_i});

and since

γ^i = eγ^i + fγ^i,

it follows that

(2)   Σ m(γ^i_{n_i}) = Σ m((eγ^i)_{n_i}) + Σ m((fγ^i)_{n_i}).

Moreover, φ(e) and φ(f) being lower bounds of their respective
approximating sums, we have

Σ m((eγ^i)_{n_i}) ≥ φ(e),   Σ m((fγ^i)_{n_i}) ≥ φ(f).

Combining these relations with (1), we obtain

Σ m((eγ^i)_{n_i}) + Σ m((fγ^i)_{n_i}) > Σ m(γ^i_{n_i})

which contradicts the equality (2). Hence

φ(e + f) = φ(e) + φ(f).


We now seek to define the circumstances under which the following
situation will arise:

E is an invariant measurable set of non-zero measure; F(P) is a
non-negative measurable function defined over E and possesses a finite
upper bound M on E. Moreover F(P) may vanish at most on a set
of measure zero.

Finally the integral

∫_e F(P) dσ   (e < E)

is invariant under T. (It vanishes only when m(e) = 0, by the last
assumption on F).

A necessary and sufficient condition for the existence of a set E and
an associated integral ∫ F(P) dσ is that φ(S) > 0.

We first assume the existence of E and ∫ F(P) dσ and prove
that φ(S) > 0.

By the assumptions on E and F(P), it follows that ∫_E F(P) dσ > 0.
If, now, φ(S) = 0, it will follow that ∫_E F(P) dσ = 0, which is a
contradiction. To show this, let Σ m(δ^i_{n_i}) be an approximating sum for φ(E).
The assumption φ(S) = 0 implies that φ(E) = 0 and hence that the sum
may be chosen so that Σ m(δ^i_{n_i}) < ε.
Then

Σ_i ∫_{δ^i_{n_i}} F(P) dσ ≤ M Σ_i m(δ^i_{n_i}) < εM.

Now since

∫_E F(P) dσ = Σ_i ∫_{δ^i} F(P) dσ = Σ_i ∫_{δ^i_{n_i}} F(P) dσ

(the integral being invariant) it follows that ∫_E F(P) dσ is smaller than εM and
is therefore equal to zero, which is the desired contradiction.

That the condition is sufficient follows from the theorem (see
Carathéodory, loc. cit.) that a bounded totally continuous additive
function of measurable sets is expressible as the indefinite integral of
any of its « derivatives ». The function φ is of this type and we may
therefore write

φ(e) = ∫_e D(P) dσ.

A derivative is defined as follows: To each point P is associated
a sequence of neighborhoods λ_P^i of suitable type closing down
on P and so chosen that as i → ∞,

φ(λ_P^i) / m(λ_P^i)

shall converge to a unique limit. Different derivatives may result
from different choices of the regions λ_P^i. All derivatives are
summable functions, however, and any two of them differ at most on a
set of measure zero.

In view of the inequality φ(e) ≤ m(e) it follows that every
derivative of φ has the upper bound 1.

The set G of points for which D(P) > 0 is measurable and its
measure is greater than zero, since

∫_G D(P) dσ = ∫_S D(P) dσ = φ(S) > 0.

Moreover, ∫_{G_k} D(P) dσ > 0 for every k, on account of the invariance
of φ(e). Hence on G_k, D(P) can vanish only over a set of measure


zero. Therefore, on the invariant set E = G + G_1 + G_{-1} + ..., D(P)
vanishes at most on a set of measure zero. The set E and the function
D(P) taken over E are precisely the set and associated function
demanded by the theorem, and the sufficiency of the condition is thus
established.

Let δ be an arbitrary measurable set of non-zero measure on S.

A sufficient condition for the existence of an invariant integral
defined over the whole of S is that

(A)   m(δ_k) > ε m(δ),

where ε is independent of δ and k.

This is a corollary of the preceding theorem. In fact, an easy
consequence of the condition (A) is that φ(e) vanishes only when m(e) = 0.

We shall now add a few remarks concerning linear dependence.
Let us suppose that T admits one or more invariant integrals of the
type ∫ F(P) dσ, where, to simplify the discussion, the measurable
function F(P) is assumed to be defined over the whole of S and is
non-negative thruout, vanishing at most on a set of measure zero;
moreover F(P) will possess a finite upper bound on S. All invariant
integrals in this discussion will be of the same type.

The invariant integrals ∫ F^1(P) dσ, ..., ∫ F^n(P) dσ are linearly
dependent if there exist constants A^1, ..., A^n not all zero, such that
the function

A^1 F^1 + ... + A^n F^n

vanishes « almost everywhere » on S. If no such constants exist, the
integrals are linearly independent. We shall see how the structure
of T is influenced by the existence of several linearly independent
integrals.

A transformation will be called metrically transitive if there exists
no measurable invariant set E such that 0 < m(E) < m(S). A
transformation of this type is also transitive in the ordinary sense;
that is, for any two mutually exclusive connected regions α and β,
some power of T can be chosen which will carry points of α into points


of β. If for some α and β this were not the case, then E being the
invariant set

... + α_{-1} + α + α_1 + ...,

we would have 0 < m(E) ≤ m(S − β) < m(S).

A necessary and sufficient condition that no two invariant integrals 
on S be linearly independent is that T be metrically transitive. 

The condition is necessary. For suppose that every invariant
integral depends linearly on the integral ∫ F(P) dσ. Now if there
existed an invariant set E with 0 < m(E) < m(S), the invariant
integral ∫ G(P) dσ, where

G(P) = F(P) on E,

G(P) = 2F(P) on S − E,

is linearly independent of ∫ F(P) dσ, which is impossible.

To prove that the condition is sufficient we shall show that if the 
invariant integrals 

$$I(\sigma) = \int_{\sigma} F(P)\,d\sigma \quad\text{and}\quad J(\sigma) = \int_{\sigma} G(P)\,d\sigma$$

are linearly independent, T can not be metrically transitive. 

Consider the derivatives 

$$F'(P) = \lim_{k \to \infty} \frac{I(\lambda_P^k)}{m(\lambda_P^k)}, \qquad G'(P) = \lim_{k \to \infty} \frac{J(\lambda_P^k)}{m(\lambda_P^k)},$$

where, as previously, $\lambda_P^k$ denotes a sequence of neighborhoods of 
suitable type closing down on P. For every P, the sequence $\lambda_P^k$ is 
assumed to be so chosen that the limits written above, as well as 

$$\lim_{k \to \infty} \frac{I[T(\lambda_P^k)]}{m[T(\lambda_P^k)]}, \qquad \lim_{k \to \infty} \frac{J[T(\lambda_P^k)]}{m[T(\lambda_P^k)]},$$

shall exist. 

The functions F' and G' are equal almost everywhere to F and G 
respectively. Hence the measurable function Ψ'(P), where 

$$\Psi'(P) = \frac{F'(P)}{G'(P)} = \lim_{k \to \infty} \frac{I(\lambda_P^k)}{J(\lambda_P^k)},$$



is well defined and finite on S, except at most on a set of measure 
zero, for the points of which we shall arbitrarily assign the value zero. 
We shall show that Ψ'(P) is « almost invariant », i. e. that 

$$\Psi'(P) = \Psi'(P_1)$$

except possibly for a set of points P of measure zero. For, the 
derivatives 

$$\lim_{k \to \infty} \frac{I[T(\lambda_P^k)]}{m[T(\lambda_P^k)]}, \qquad \lim_{k \to \infty} \frac{J[T(\lambda_P^k)]}{m[T(\lambda_P^k)]}$$

are equal almost everywhere to $F'(P_1)$ and $G'(P_1)$ respectively. Hence 
the function 

$$\Psi''(P) = \lim_{k \to \infty} \frac{I[T(\lambda_P^k)]}{J[T(\lambda_P^k)]}$$

is equal to $\Psi'(P_1)$ almost everywhere. But since 

$$I[T(\lambda_P^k)] = I(\lambda_P^k), \qquad J[T(\lambda_P^k)] = J(\lambda_P^k),$$

it follows that Ψ'(P) = Ψ''(P) for all points P, which proves our 
assertion. 

Let us denote by (a, b), (0 ≤ a ≤ b), the set of points for which 
a ≤ Ψ'(P) ≤ b. By the preceding paragraph, each image of (a, b) 
under powers of T or $T^{-1}$ is of same measure as (a, b). For, the 
points P for which $\Psi'(P_1) \neq \Psi'(P)$ are at most of measure zero. 
From this it follows that for every set (a, b) there exists an invariant 
set of same measure, namely the set 

$$(a, b) + T(a, b) + T^{-1}(a, b) + \cdots.$$

Now since Ψ'(P) is non-negative and finite thruout, we have 

$$m(S) \le m(0, 1) + m(1, 2) + \cdots.$$

Hence there exists a positive number B such that 0 < m(0, B). But 
for no finite number C can we have m(C, C) = m(S). For then we 
would have F'(P) = CG'(P) and hence F(P) = CG(P) almost 
everywhere, which contradicts the assumption of linear independence 
of I and J. It follows that there exists a positive number D such 
that 0 < m(0, D) < m(S). Since there exists a measurable invariant 



set of measure equal to m(0, D), T can not be metrically transitive. 
This completes the proof. 

Do there actually exist transformations which are metrically 
transitive? A simple example is the transformation of a torus given 
in terms of angular coordinates by 

$$T:\quad \varphi_1 = \varphi + h, \qquad \theta_1 = \theta + k.$$

The constants h and k are incommensurable with 2π and with each 
other. T moreover admits the invariant integral $\iint d\varphi\,d\theta$ which is 
reducible to an integral of the type considered. The proof that T 
has the property in question offers no difficulty. 
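
The torus example can be probed numerically. The following is a minimal sketch, not a proof: it iterates the translation and compares the fraction of time an orbit spends in a rectangle with the rectangle's share of the total area, which is what metric transitivity leads one to expect. The particular values of h and k, the rectangle, and the function names are illustrative assumptions, not taken from the paper.

```python
import math

TWO_PI = 2.0 * math.pi

def torus_translation_orbit(h, k, steps, start=(0.0, 0.0)):
    """Yield successive images of `start` under phi -> phi + h, theta -> theta + k (mod 2*pi)."""
    phi, theta = start
    for _ in range(steps):
        yield phi, theta
        phi = (phi + h) % TWO_PI
        theta = (theta + k) % TWO_PI

def visit_frequency(h, k, steps, box):
    """Fraction of the first `steps` orbit points lying in the rectangle `box`."""
    (phi_lo, phi_hi), (theta_lo, theta_hi) = box
    hits = sum(
        1
        for phi, theta in torus_translation_orbit(h, k, steps)
        if phi_lo <= phi < phi_hi and theta_lo <= theta < theta_hi
    )
    return hits / steps

# Assumed sample values: h = 1 and k = sqrt(2) are incommensurable
# with 2*pi and with each other.
h = 1.0
k = math.sqrt(2.0)

# Rectangle covering half of the phi-circle and a quarter of the theta-circle,
# i.e. one eighth of the torus area.
box = ((0.0, math.pi), (0.0, math.pi / 2.0))

freq = visit_frequency(h, k, 200_000, box)
print(f"orbit frequency = {freq:.4f}, area fraction = {1.0 / 8.0:.4f}")
```

With commensurable constants, for instance h = k = π, the orbit is periodic and the visit frequency no longer matches the area fraction, which illustrates why the incommensurability hypothesis is needed.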


5. Regular regions. Returning to the definition of ω-regular 
points (§ 3), we see that if any exist, their totality constitutes a set 
of inner points, and falls therefore into a set of maximal connected 
regions or components. Each point on the boundary of a component 
either belongs to $N^1$ or is ω-irregular. It follows also from the 
definition that the transform of an ω-irregular point is again ω-irregular. 
This holds also for points of $N^1$, and hence the transform of a 
component is again a component. Moreover, a component C is either 
wandering or else periodic or invariant, for if C intersects $C_k$, then C 
and $C_k$ must be identical. 


Theorem. A component of ω- (or α-) regular points is either 
simply or doubly connected. 


In carrying out the proof, we shall take S to be of genus zero, altho 
the theorem holds for any genus (¹). 

Let C be a component of ω-regular points (essentially the same 
argument will hold for an α-regular component) and γ its boundary. 
There is on S at least one invariant point O (²), and we shall consider 
it as the « point at ∞ ». We can then distinguish between the interior 
and exterior of a simple closed curve drawn on S and not passing 
thru O, in particular, of any simple closed curve contained in C 
or any image of C. 

(¹) A proof of this will appear elsewhere. 
(²) First proved by Brouwer (I). 

We shall consider separately the two cases, (I) C is wandering, 
and (II) C is periodic or invariant. 

I. We shall show in this case that there can not be drawn in C a 
simple closed curve enclosing points of γ, from which it will follow 
that C is simply connected. 

Let α be a simple closed curve in C, and A the region interior to α. 
Let k be the smallest positive integer, if any exists, for which $A_k$ 
intersects A. Since α lies in a wandering region, no two of its 
images can intersect, and hence A is expanding or contracting 
under $T^k$; we shall assume the former, the argument being quite 
similar for both cases. Thus $A_k$ contains A, but has no points in 
common with $A_1, A_2, \ldots, A_{k-1}$. The limit region 

$$D = A + A_k + A_{2k} + \cdots$$

is simply connected and invariant under $T^k$. It is clear moreover 
that the k regions 

$$D_i = A_i + A_{k+i} + A_{2k+i} + \cdots \qquad (i = 0, 1, \ldots, k-1)$$

are each of same type as D, and are mutually exclusive. 

Now consider in D the ring $r = (\alpha\,\alpha_k)$, i. e. the region bounded by 
α and $\alpha_k$. The images $r_1, r_2, \ldots, r_{k-1}$ are contained respectively 
in $D_1, D_2, \ldots, D_{k-1}$, while $r_k$ is adjacent to r, $r_{k+1}$ to $r_1$, etc. 
Clearly r is wandering, and hence contains no points of $N^1$. Moreover, 
the points of r are ω-regular. For since the area of $r_n$ converges to 
zero with 1/n, the points of $r_n$ tend asymptotically and uniformly 
toward $\alpha_n$, which in turn, since α is in C, tends uniformly toward $N^1$. 

It follows that r contains no points of γ, nor of any image of γ. 
Hence r lies in C, because α does, and also in $C_k$ because $\alpha_k$ does. But 
this is impossible since $C_k$ can not intersect C. We conclude 
therefore, that no integer k with the stated property exists, and A must 
accordingly be wandering. But then the area of $A_n$ converges to 
zero with 1/n and hence A tends uniformly to $N^1$, since its boundary 
does. Therefore A contains no ω-irregular points and hence no points 
Journ. de Math., tome VII, Fasc. IV, 1928. 



of γ. Since α is an arbitrary closed curve in C, it follows that 
C is simply connected. 

II. We shall assume for simplicity that C is invariant, — the 
argument for the periodic case being essentially the same. 

Let us suppose that there can be drawn two non-intersecting simple 
closed curves α and β whose interior regions A and B both contain 
points of γ, but have no points in common. We shall show that this 
situation is impossible, thus proving that C is at most doubly 
connected. 

The closed set $\alpha_n$ tends uniformly toward $N^1$ as n increases 
indefinitely. Hence for a sufficiently large positive integer k, $\alpha_n$ fails to 
intersect α when n ≥ k. Suppose first that $A_n$ likewise fails to 
intersect A when n ≥ k. Then by theorem 4, § 3, A contains no 
non-wandering points and hence no points of $N^1$. Moreover, no two 
regions of the sequence 

(2) $A, A_k, A_{2k}, \ldots$ 

can intersect. Hence by the same argument used above, the regions 
$A_{jk}$ tend uniformly toward $N^1$ as j increases without limit. This 
situation holds for each one of the sequences 

$A_i, A_{k+i}, A_{2k+i}, \ldots$ (i = 0, 1, ..., k − 1) 

since each is of same type as (2). The totality of these sequences 
includes all the regions $A, A_1, A_2, \ldots$ which makes it clear that 
A consists of ω-regular points. Since, as we have shown, A contains 
no points of $N^1$, it follows that A contains no points of γ, contrary to 
the choice of A. Thus the assumption that $A_n$ (n ≥ k) fails to 
intersect A is untenable. Since some $A_m$ (m ≥ k) intersects A while 
$\alpha_m$ fails to intersect α, it follows that A is contracting or expanding 
under $T^m$, and hence contains a point U invariant under $T^m$ (Brouwer, I). 
By precisely the same reasoning, there is a point V in B, invariant 
under some $T^{m'}$, m' ≥ k. 

Let us join α and β by a simple arc τ in C. By tracing a contour 
about the set α + β + τ and sufficiently close thereto, we obtain a 
simple closed curve σ in C, whose interior region D contains no points 
of γ other than those in A and B. The set D − (A + B + α + β) 
consists entirely of ω-regular points. 

We may of course apply the same reasoning as above to D. Hence 
if s is a sufficiently large multiple of mm', the transformation $T^s$ leaves 
invariant the points U and V and admits D as an expanding or 
contracting region. 

Consider now the ring r bounded by σ and $T^s(\sigma)$. The rings $T^s(r)$ 
and $T^{-s}(r)$ are adjacent to r, one within and one without. The limit 
region R consisting of r together with all its images (boundaries 
included) under positive and negative powers of $T^s$, is a doubly 
connected region, invariant under $T^s$. Let I and E be its inner and 
outer boundaries. 

We now examine separately the two possibilities as to D. 

(a) D is expanding under $T^s$. In this case I contains no ω-regular 
points. For any neighborhood ρ of a point P on I contains points in 
the limit ring R, and these points are carried by iteration of $T^s$ 
toward E, whereas all images of P remain on I. Hence from a 
certain rank on, each member of the sequence 

$$\rho,\ \rho_s,\ \rho_{2s},\ \ldots$$

must intersect σ, which is at a non-zero distance from $N^1$. This makes 
it clear that P can not be ω-regular under $T^s$; nor then, under T. 

Now I is contained in D. But since I contains no ω-regular points 
it can not intersect the set D − (A + B + α + β). Hence I is 
contained in A + B and is at a non-zero distance from α + β. But this 
is impossible since every approximating curve $T^{-ns}(\sigma)$ (n = 1, 2, ...) 
encloses the points U and V, and therefore contains points not in A 
or B. This contradiction excludes the possibility that D be 
expanding under $T^s$. 

(b) D is contracting under $T^s$. Whatever points of $N^1$ there may be 
in D lie in A + B. Hence the curve σ, which tends toward $N^1$ 
uniformly on iteration of $T^s$, must eventually be contained in A or B. 
But this is impossible, since each image of σ encloses the points U and V. 

The contradiction in this final possibility shows that there can not 
exist curves α and β with the stated properties, which completes the 
proof. 

On a surface of genus zero, a regular component of wandering type 
is simply connected, as follows from the proof of the theorem. It 
can be shown that this result holds for all surfaces of genus different 
from 1. For a torus, however, as we shall show later by an example, 
the regular components of wandering type may be doubly 
connected. 

In a regular component C there are no invariant or periodic points. 
Hence if C is of invariant simply connected type, it follows from a 
theorem of Brouwer (II) that within C, T is topologically equivalent 
to a translation. 

If C is invariant and doubly connected, its boundary consists of two 
continua, at least if S is of genus zero. Moreover, from the proof of 
the preceding theorem, each point of C tends asymptotically towards 
one and the same of these continua, on indefinite iteration of T or $T^{-1}$, 
according as C is ω- or α-regular. 

Let W be the set of all ω-irregular points and A the set of all 
α-irregular points. 

Theorem 5. On a surface of genus zero, each point of W (or A) 
is connected to $N^1$ thru W (or A). 

For otherwise it would be possible to draw a simple closed curve ρ 
enclosing points of W (A) but not of $N^1$. Thus ρ must lie entirely in 
some ω- (α-) regular component C. Since boundary points of C are 
enclosed by ρ, C must be doubly connected and therefore of periodic 
type. Hence the inner boundary of C, being a closed periodic set, 
must contain points of $N^1$, which is impossible. This establishes the 
theorem. 

We shall now consider some simple examples displaying various 
types of regular components. 

The first is a transformation of a sphere illustrated schematically 
in fig. 2. Here we have taken the plane with a single point at ∞ for 
our representation of S. The points B and ∞ are the only invariant 
points, and the set $N^1$ contains only these points. The motion of the 
remaining points is indicated by the arrows. Clearly the points on 
the arc BA∞ (excepting B and ∞) are ω-irregular while all others 
are ω-regular. Similarly the points on BC∞ are α-irregular and all 
others are α-regular. (We see here that irregular points need not be 
both α- and ω-irregular.) Hence the region whose boundary consists 
of BA∞ (BC∞) is an ω- (α-) component of simply connected invariant 
type. Finally, the example shows how each irregular point is 
connected to $N^1$ thru a continuum of irregular points of same type 
(theorem 5). 

Fig. 2. 

Let us next consider an example in which the regular components 
are of wandering type. It is easily shown (see Poincaré, I) that if a 
sense-preserving transformation of a circle C into itself admits no 
invariant or periodic points, it must be of one of two types. In the 
first type, $N^1$ coincides with C, while in the second, $N^1$ is a perfect 
nowhere dense set on C. Let t be a transformation of the second 
type. The set $C - N^1$ consists of a denumerable infinity of wandering 
open arcs; suppose one of these is δ. Since the end-points of δ 
and of each image of δ are in $N^1$, and since the length of $\delta_n$ converges 
to zero with 1/n, it follows that δ is an α- and ω-regular « component » 
of wandering type. 

Suppose now that S is a sphere, and that C is a great circle, 
undergoing the transformation t. We may extend t to the whole of S by 
letting each circle parallel to C undergo the corresponding congruent 
transformation. It is clear that corresponding to the wandering arcs 
of C, we now have wandering simply connected regions, and they are 
α- and ω-regular components of S. 

Suppose finally that S is a torus with angular coordinates θ and φ. 
By letting the circles φ = const. undergo congruent transformations 
of the same type as t, we obtain a transformation of S in which the 
regular components are wandering rings bounded by circles of the 
family θ = const. As we have already pointed out, regular 
components of this type can only exist on surfaces of genus 1. 


6. The general analytic case. Transformations of the general 
analytic case (§ 1) are free from certain structural complexities and 
therefore seem best suited for study, in an attempt at systematic 
structure analysis. In this paper, we can only make a beginning, 
and must moreover limit ourselves to the simplest case, that in 
which the central motions are finite in number. 

Theorem 6. In the general analytic case, there must exist at least 
two central motions. 


Proof. There exists in any case at least one central motion M. 
We shall show that in the general analytic case there must exist 
further central motions. 

If the complete sequence M is not pseudo-recurrent, then there are 
further central motions by theorem 2, § 2. Hence we may suppose M 
to be pseudo-recurrent. If M contains infinitely many points, its 
complete group would be a perfect set and would therefore contain 
central motions other than M, since M is at most denumerable. Hence 
we may suppose M to consist of a single periodic group. For 
simplicity let us suppose that M contains but a single point. There is no 
difficulty in extending the remainder of our argument to the more 
general situation. 


If the invariant point M is of stable type, it is contained in a small 
expanding or contracting region σ. Hence the closed set S − σ is 
transformed into part of itself by T or $T^{-1}$, and therefore contains a 
closed invariant subset which must contain further central motions (§ 2). 
Thus we may suppose M to be of unstable type. More explicitly, 
we may suppose M to be of directly unstable type, for if inversely 
unstable, we may replace T by $T^2$ in the remainder of the argument, 
making use of theorem 3, § 5. 

We shall show that there exist central points other than the 
directly unstable point M. Consider the sequence $N^1, N^2, \ldots, N^s = M$, 
and suppose for the moment that s > 1. We may assume that 
each $N^i$ (i < s) has the property that from it there can be extracted 
an infinite sequence of points converging to M. For if this were not 
true for $N^1$, say, $N^1 - M$ would be a closed (or finite) invariant set 
and would contain central motions other than M. Let $P^1, P^2, \ldots$, 
be a sequence of this type extracted from $N^1$. 

Near M choose four points A, B, C, D on the four invariant branches 
respectively abutting at M. Let E be the closed set consisting of the 
four arcs $AA_1$, $BB_1$, ..., of the four invariant branches. We assert 
that E intersects each $N^i$, i < s. For suppose a point Q is very close 
to M. If Q is on one of the invariant branches, some image of Q will 
certainly fall on E. In the contrary case, Q will move along close to 
an α- (ω-) branch on repeated iteration of $T^{-1}$ (T), (see § 1). 
Hence some image of Q will fall very close to E. From this it follows 
that out of the set $P^1, P^2, \ldots$, and its images, we can extract an 
infinite sequence of the form 

$$P^{i_1}_{j_1},\ P^{i_2}_{j_2},\ \ldots$$

converging to a point L on E. The sequence is contained in $N^1$, and 
hence so is L, since $N^1$ is closed. This proves our assertion for $N^1$, 
and the same reasoning applies for each $N^i$ (i < s). 

Suppose that the sequence 1, 2, ..., ω, ..., of ordinals less than s 
possesses no last element. Then $N^s$ consists of those points which are 
common to all the sets $N^1, N^2, \ldots, N^i, \ldots$ (i < s). But 
$EN^1 \supset EN^2 \supset \cdots$, and hence there is a closed set of points common 
to the closed sets $EN^i$ (i < s). This set is of course $EN^s$ and hence 
$N^s$ contains points other than M; that is, there are central motions 
other than M. 

There remains only the case in which the sequence of ordinals less 
than s possesses a last element s − 1. If Q is a point of $N^{s-1}$ (or a point 
of S different from M, in case s = 1) the sequences $Q, Q_1, \ldots$ 
and $Q, Q_{-1}, \ldots$ both converge to M. If we recall (§ 1) that points 
in a small neighborhood σ of M and not lying on any invariant branch 
of M are carried out of σ on repeated iteration of T or $T^{-1}$, it becomes 
clear that Q is doubly asymptotic to M in the sense of § 1, and is in 
fact a homoclinic point. Hence by the theorem of § 1, there exist 
central motions other than M. This completes the proof. 

Let us suppose that S is a sphere, and examine the structure of T 
in the case when there are exactly two central motions. 

To every invariant point of S there is associated a number i called its 
index, and in the general analytic case i may only take the value +1 
or −1. More explicitly, the directly unstable points are of index −1 
and all others of index +1. On a sphere, the sum of the indices of the 
invariant points (¹) is always 2. (With regard to the statements 
above, see Birkhoff II.) Hence, in the general analytic case 
there are at least 2 invariant points. But in the case under 
consideration there are exactly two, say P and Q, since each is a 
central motion. There are no further periodic points of any order. 
The index of each point is 1; hence each is of stable or inversely 
unstable type. If one or both were of the latter type, the sum of the 
indices of the invariant points of $T^2$ would be < 2 which is impossible; 
hence both are of stable type. 

About P may be drawn a small circle enclosing a region σ expanding 
under T or $T^{-1}$, suppose under T. The boundary Σ of the simply 
connected limit region 

$$\sigma + T(\sigma) + T^2(\sigma) + \cdots$$

must contain Q, for otherwise, being a closed invariant set, it would 
contain central motions other than P and Q. 

If Σ contains points other than Q, let ρ be a small expanding (or 
contracting) region containing Q. The closed set Σ − Σρ is carried 
into a part of itself by T (or $T^{-1}$) and therefore contains central 
motions other than P and Q, which is impossible. Hence Σ = Q. 

Thus S is divided by a system of concentric analytic closed curves into 
a system of adjacent rings $r_0, r_{-1}, r_1, \ldots$. The rings $r_1, r_2, \ldots$ 
and $r_{-1}, r_{-2}, \ldots$ close down respectively on the two invariant 
points, and each $r_n$ is carried by T into the adjacent ring $r_{n+1}$. 

(¹) Assuming them to be finite in number, which they are in the general analytic 
case. 

In the general analytic case in which the central motions are finite 
in number, it is probable that a complete structure analysis can be 
effected, as we shall now indicate. 

Let us again take S to be a sphere and assume that the number of 
central motions is finite and greater than 2. From the proof of 
theorem 6, it is clear that the central points are finite in number, each 
central motion being either an invariant point or a periodic point 
group. Let k be the least common multiple of the orders of the 
various periodic groups. The central points are invariant under $T^k$. 

Suppose that of these invariant points, p are stable, m directly 
unstable and n inversely unstable, and let p', m', n' represent the 
corresponding numbers relative to $T^{2k}$. Then referring to the 
discussion above relative to indices, 

$$p - m + n = p' - m' + n' = 2.$$

Moreover (§ 1) p = p' and m ≤ m'. Hence n ≤ n'. But under $T^{2k}$ 
there are no points of inversely unstable type; hence n = n' = 0, and 
from this we have p ≥ 2. Let the totality of stable points be 
$Q^1, \ldots, Q^p$ (p ≥ 2). 
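
The index bookkeeping in this step can be checked by brute force. The sketch below is only an illustration under stated assumptions: it enumerates small non-negative counts, imposes the relations quoted in the text (index sum 2 for both $T^k$ and $T^{2k}$, p' = p, m ≤ m', n' = 0), and confirms that every admissible combination has n = 0 and p ≥ 2. The search bound and the function name are hypothetical.

```python
from itertools import product

def admissible_counts(limit):
    """Return all (p, m, n, m_prime) with entries in 0..limit satisfying the text's relations.

    p, m, n  : numbers of stable, directly unstable, inversely unstable points of T^k
    m_prime  : number of directly unstable points of T^(2k)
    Assumptions taken from the text: p' = p, n' = 0, m <= m', and both index sums equal 2.
    """
    solutions = []
    for p, m, n, m_prime in product(range(limit + 1), repeat=4):
        index_sum_k = p - m + n          # index sum relative to T^k
        index_sum_2k = p - m_prime + 0   # index sum relative to T^(2k), with p' = p and n' = 0
        if index_sum_k == 2 and index_sum_2k == 2 and m_prime >= m:
            solutions.append((p, m, n, m_prime))
    return solutions

solutions = admissible_counts(8)
for p, m, n, m_prime in solutions:
    print(f"p={p} m={m} n={n} m'={m_prime}")
```

Every tuple printed has n = 0 and p = m + 2, in agreement with the conclusion p ≥ 2 drawn in the text.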

Containing each $Q^i$, there is a small region $\sigma^i$ expanding under $T^k$ 
or $T^{-k}$. Each $Q^i$ is therefore contained in a simply connected limit 
region $A^i$, 

$$A^i = \sigma^i + T^{\varepsilon k}(\sigma^i) + T^{2\varepsilon k}(\sigma^i) + \cdots,$$

where ε is +1 or −1 according as $\sigma^i$ is expanding under $T^k$ or $T^{-k}$. 
Each $A^i$ is invariant under $T^k$. 

Let the boundary of $A^i$ be denoted by $\alpha^i$ (i = 1, 2, ..., p). It is 
clear that if $\alpha^1$, say, consists of only one point, that point together 
with $Q^1$ are the only central motions of $T^k$ and hence of T (theorem 3, 
§ 5), whereas we are assuming that the number of central motions is 
greater than 2. Hence each $\alpha^i$ is a closed periodic continuum and 
hence contains central points. 

Each $\alpha^i$ contains at least one periodic point of directly unstable type. 

Proof. Consider $\alpha^1$, and suppose $\sigma^1$ is expanding under $T^k$. Let 
us suppose moreover that $\alpha^1$ contains no points of directly unstable 
type. Then the central points which do lie on $\alpha^1$ are of stable 
type. Suppose one of them to be $Q^2$. Then $\sigma^2$ must be contracting 
under $T^k$. For if $\sigma^2$ were expanding, the points of $\sigma^2$ would tend 
toward $Q^2$ on repeated iteration of $T^{-k}$; but $\sigma^2$ contains points of $A^1$, 
and those points must tend toward $Q^1$ on repeated iteration of $T^{-k}$, 
which would be impossible. 


Since then, $\sigma^2$ is contracting under $T^k$ and therefore expanding 
under $T^{-k}$, the closed set $\alpha^1 - \alpha^1\sigma^2$ is transformed by $T^{-k}$ into a part 
of itself; hence it contains a closed subset invariant under $T^k$, and 
therefore contains points of stable type other than $Q^1$ and $Q^2$. Suppose 
that one of these is $Q^3$. Then the same argument used for $\sigma^2$ shows 
that $\sigma^3$ is contracting under $T^k$. Hence $\alpha^1 - \alpha^1\sigma^2 - \alpha^1\sigma^3$ contains a 
further stable point, say $Q^4$. Continuing thus, the set of stable 
points is eventually exhausted. When this stage is reached, one 
more application of the process must yield a point of directly unstable 
type on $\alpha^1$. This completes the proof. 

Suppose that the order of the region $A^1$ is m. Then the regions $A^1_1, \ldots, 
A^1_{m-1}$ are of same type as $A^1$ and together with $A^1$ form a periodic set 
of mutually exclusive regions. Similarly, each $A^i$ belongs to a 
periodic set of this sort. Two of these sets, however, may overlap. 

One method of procedure would be to study the properties of a 
maximal set M of periodic sets, all the regions of which are mutually 
exclusive. Within M, the structure of T is known. On the 
boundary of M are a number of directly unstable periodic points, and it 
can be shown that certain of the α- and ω-branches of each of 
these points must also be contained in M. Finally, the remainder 
of S falls into a set of connected regions of periodic or wandering 
type, and these in turn break up into regular components of various 
types. It is hoped that a complete analysis of this case, as well as 
more complex cases, will soon be accomplished. 


References. 

Birkhoff. I. Quelques théorèmes sur le mouvement des systèmes dynamiques 
(Bulletin de la Société mathématique de France, vol. 40, 1912). 

II. Dynamical systems with two degrees of freedom (Transactions of the 
American Mathematical Society, vol. 18, 1917). 

III. Surface transformations and their dynamical applications (Acta 
mathematica, vol. 43, 1920). 

IV. Über gewisse Zentralbewegungen dynamischer Systeme (Göttinger 
Nachrichten, 1926). 

V. On the periodic motions of dynamical systems (Acta mathematica, 
vol. 50, 1927). 

Brouwer. I. Continuous one-one transformations of surfaces in themselves 
(Proceedings of the Section of Sciences, Koninklijke Akademie van 
Wetenschappen te Amsterdam, vol. 11-15, 1908-1912). 

II. Beweis des ebenen Translationssatzes (Mathematische Annalen, Bd. 72, 
1912). 

Lattès. Sur les équations fonctionnelles qui définissent une courbe ou une 
surface invariante par une transformation (Annali di Matematica, 3e série, 
vol. 13, 1907). 

Poincaré. I. Sur les courbes définies par les équations différentielles (Journal 
de Mathématiques, 3e série, vol. 7-8, 1881-1882, et 4e série, vol. 1-2, 1885-1886). 

II. Méthodes nouvelles de la Mécanique céleste, vol. 3. 





INSTITUT DE FRANCE 


ACADÉMIE DES SCIENCES. 


(Extract from the Comptes rendus des séances de l'Académie des Sciences, 
vol. 192, p. 196, session of 26 January 1931.) 


GEOMETRY. An n-dimensional generalization of Poincaré's last geometric 
theorem (Une généralisation à n dimensions du dernier théorème de géométrie 
de Poincaré). Note (¹) by George D. Birkhoff. 


Let T be a continuous, one-to-one, direct (sense-preserving) transformation 
of a region R into a region T(R), R being simply connected and surrounding 
the invariant point O. Suppose moreover that T is conservative, that is, 
preserves areas. Then Poincaré's theorem (²), as I have already 
extended it (³), may be stated as follows: 

If the transformation T rotates all the points of the boundary of R in 
one and the same sense about O, and all points sufficiently near O in the 
opposite sense, there will exist at least two other invariant points. 

Evidently this theorem is closely related to the following: 

If the transformation T rotates about O no point of a closed curve C 
surrounding O and cut only once by each radial line issuing from O, 
there will exist at least two invariant points on this curve C. 

The proof of this latter theorem is immediate. Indeed the image T(C) 
can neither contain C in its interior, nor be contained in C, because of 
the invariance of areas. Hence T(C) must cut the curve C at least 
twice, and the common points will necessarily be invariant points. 

(¹) Session of 5 January 1931. 

Our aim here is to indicate how this seemingly almost trivial theorem 
can be generalized to n = 2m dimensions, and to give its application to 
Dynamics. 

We consider a transformation T: 

$$x_i' = \varphi_i(x_1, \ldots, y_m), \qquad y_i' = \psi_i(x_1, \ldots, y_m) \qquad (i = 1, \ldots, m), \tag{1}$$

where the $\varphi_i$, $\psi_i$ are real, analytic, and all zero at the origin 

$$x_1 = y_1 = \cdots = y_m = 0$$

(an invariant point of T), while their functional determinant does not 
vanish at that point. We call such a transformation conservative if one 
can write identically 

$$dI = \frac{1}{2} \sum_{i=1}^{m} \left[ (x_i'\,dy_i' - y_i'\,dx_i') - (x_i\,dy_i - y_i\,dx_i) \right], \tag{2}$$

where I is an analytic function of $x_1, x_2, \ldots, y_m$. The meaning of 
this known relation is that the sum of the areas enclosed by m curves 
lying in the m planes of $x_i, y_i$ is preserved by the transformation T. 

Suppose now that there exists an m-dimensional surface C having the 
following properties: 

1° The equations of C are of the form 

$$r_i = f_i(\theta_1, \ldots, \theta_m) \qquad (i = 1, \ldots, m),$$

where $r_i, \theta_i$ are the polar coordinates corresponding to $x_i, y_i$, and the $f_i$ 
are analytic, positive functions, periodic of period 2π in $\theta_1, \ldots, \theta_m$. 

2° The transformation T « rotates » about the origin O no 
point of C; that is, we always have $\theta_i' = \theta_i$ (i = 1, ..., m) for such 
a point. 

According to the theorem which we wish to prove, there will exist at least 
$2^m$ invariant points of the conservative 2m-dimensional transformation T on 
such an m-dimensional surface C. 

Indeed the relation (2) may be written 

$$dI = \frac{1}{2} \sum_{i=1}^{m} \left( r_i'^{\,2}\,d\theta_i' - r_i^{2}\,d\theta_i \right).$$

But on the surface C we have $d\theta_i' = d\theta_i$ (for i = 1, ..., m), hence 

$$dI = \frac{1}{2} \sum_{i=1}^{m} \left( r_i'^{\,2} - r_i^{2} \right) d\theta_i.$$

On the other hand, the function I will have on this surface at least $2^m$ points 
at which dI = 0, which implies $r_i' = r_i$ (for i = 1, ..., m). All these 
points are necessarily invariant. 
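
The count of $2^m$ critical points used in this step can be illustrated on the simplest periodic function of m angles. The sketch below is an illustration only: the sample function $\sum_i \cos\theta_i$ is a hypothetical stand-in for the function I restricted to C, and the code merely verifies that the $2^m$ points with every angle equal to 0 or π are critical points of it. It does not search for further critical points and is not the argument of the note.

```python
import math
from itertools import product

def sample_function(thetas):
    """Hypothetical periodic function of m angles, standing in for I on the surface C."""
    return sum(math.cos(t) for t in thetas)

def gradient(f, thetas, step=1e-6):
    """Central-difference approximation to the gradient of f at the point `thetas`."""
    grad = []
    for i in range(len(thetas)):
        forward = list(thetas)
        backward = list(thetas)
        forward[i] += step
        backward[i] -= step
        grad.append((f(forward) - f(backward)) / (2.0 * step))
    return grad

def critical_points(m, tolerance=1e-6):
    """Return the candidate points (each angle 0 or pi) at which the gradient vanishes."""
    candidates = product((0.0, math.pi), repeat=m)
    return [
        point
        for point in candidates
        if all(abs(g) < tolerance for g in gradient(sample_function, point))
    ]

for m in (1, 2, 3):
    print(m, len(critical_points(m)))
```

For this sample function all $2^m$ candidates pass the test, so the lower bound quoted in the text is attained.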

The application of this theorem to Dynamics requires a special study 
of conservative transformations. As I have already shown (loc. cit.), 
these can be put into an essentially unique normal form 

(3) [equations illegible in the source: normal form of T, involving a 
function K and the constants $\lambda_i$, $c_{ij}$] 

To reach this normal form it is in general necessary to employ divergent 
series with imaginary coefficients. In the case of a stable periodic 
motion, the constants $\lambda_i$ and $c_{ij}$ are pure imaginary constants. 

By employing this normal form, one can prove the existence of an 
infinite number of values of k with corresponding transformations $T^k$ 
(powers of T) which yield surfaces $C^k$ of the type required by our 
theorem, whence the principal conclusion: 

If the constants $\lambda_i$ associated with a stable periodic motion of a 
dynamical system of m + 1 degrees of freedom are connected by no linear 
commensurability relation with integer coefficients $p_i$, and if the 
determinant $|c_{ij}|$ is not zero, there will exist an infinite number of 
periodic motions in the immediate neighborhood of the given periodic motion. 

This result, which I had proved only in the case of two degrees 
of freedom (loc. cit.), thus still holds in the general case. 





Reprinted from the Proceedings of the National Academy of Sciences, 
Vol. 17, No. 12, pp. 650-655. December, 1931. 


PROOF OF A RECURRENCE THEOREM FOR STRONGLY 
TRANSITIVE SYSTEMS 

By George D. Birkhoff 

Department of Mathematics, Harvard University 
Communicated November 27, 1931 


Let 

$$\frac{dx_i}{dt} = X_i(x_1, \ldots, x_n) \qquad (i = 1, 2, \ldots, n) \tag{1}$$

be a system of n differential equations of the first order, valid in a closed 
analytic n-dimensional manifold without singularity, M. The points 
of M are taken to be represented by a finite number of such sets of variables 
(x) in overlapping domains. For definiteness, the right-hand members 
$X_i$ as well as the transformations of connection between the sets (x) are 
taken to be analytic. Finally it will be assumed that there is a volume 
integral invariant, $\int dx_1\,dx_2 \cdots dx_n$ in suitable coordinates. 




If we select any (n − 1) dimensional analytic surface σ in M, which 
cuts the trajectories in one and the same sense throughout at an angle 
θ ≥ d > 0, the points of σ whose trajectories cut σ infinitely often as 
the time t increases and decreases, fill all of σ save at most a set of measure 
0 in the sense of Lebesgue. This is essentially the significance of the 
classic work of Poincaré on the recurrence of trajectories.¹ 

Now it is probably true that in general such systems (1) are strongly 
transitive in the sense that any measurable set of complete trajectories 
in M has either the measure 0 or V, where V is the total volume of M. 
The fact that such strong transitivity may be realized has been shown in 
a simple example by E. Hopf, who has first defined this type of transitivity. 

We propose in this note to prove the following simple recurrence theorem:
if $t_n(P)$ denotes the time of the $n$th crossing of $\sigma$ by a trajectory which issues
from a point $P$ of $\sigma$, then we have, for a certain constant $\tau$,

$$\lim_{n \to \infty} \frac{t_n(P)}{n} = \tau \tag{1}$$

for all points $P$ save those which belong to a set of measure 0. In other
words there is a fixed "mean time" of crossing on a general trajectory.

Very recently von Neumann,² by an application of abstract integral
equation theory in a direction suggested by Koopman,³ has obtained
results which would show that $t_n(P)/n$ converges in the mean toward $\tau$;
but this does not show convergence nor a mean time in the usual sense.
E. Hopf² has established his results directly.

I propose to establish (1) here, and in the following note to establish
a general recurrence theorem and thence the "ergodic theorem."

The method of proof is one which I tried to use nearly ten years ago in 
order to show that there was some uniformity of recurrence when there 
was merely regional transitivity. That attempt would seem to have 
failed because the hypothesis was not exacting enough. It is to be
remarked that the demonstration of the strong transitivity condition in
any except very simple cases appears to be extraordinarily difficult.

Consider an "infinitesimal" cylinder made up of arcs of trajectories with
a base $d\sigma$ at $P$ in $\sigma$, and of height $dn$ normal to $d\sigma$. Its volume is then
$d\sigma\,dn$ or $v\cos\theta\,d\sigma\,dt$, where $v$ denotes the velocity, and $dt$ denotes the
corresponding time.

Suppose now that the tube of trajectories with this base $d\sigma$ ($t$ increasing)
is followed until it cuts $\sigma$ a first time in the base $d\bar\sigma$ at $\bar P$. A doubly closed tube is thus
formed having a total volume $t(P)\,v\cos\theta\,d\sigma$, where $t(P)$ stands for the
time between the crossing at $P$ and at $\bar P$. Let $t$ increase further
by $dt$. The tube then advances to a new position, differing from the former
in that the cylinder of volume $v\cos\theta\,d\sigma\,dt$ has been subtracted, and the
cylinder of volume $\bar v\cos\bar\theta\,d\bar\sigma\,dt$ has been added. But these volumes are



equal, since volumes are conserved. In consequence if we designate the
analytic point function $v\cos\theta > 0$ by $\omega(P)$ it is clear that $\omega(P)\,d\sigma =
\omega(\bar P)\,d\bar\sigma$; in other words, $\int \omega(P)\,d\sigma$ is conserved by the $(n-1)$-dimensional
transformation $T(P)$ which takes $P$ in $\sigma$ to $\bar P$ in $\sigma$.

According to the result of Poincaré, the transformation from $P$ to $\bar P$ is
one-to-one in $\sigma$ except at a set of points of measure 0. Of course, $t(P)$ is
defined except at such points. More precisely, $\sigma$ may be broken up into
a numerable set of open continua, in which the transformation $T(P)$ and
the function $t(P)$ are analytic, together with a further set of measure 0.

If the time to the $n$th crossing of $\sigma$ be denoted by $t_n(P)$ (defined except
for a set of measure 0), we have the fundamental functional identity

$$t_n(P) = t(T^{n-1}(P)) + t_{n-1}(P), \tag{2}$$

which states that the time to the $n$th crossing is the time beyond the
$(n-1)$th crossing together with the time to the $(n-1)$th crossing.
Here $T^k(P)$ denotes the $k$th transformed point of $P$. By successive use
of the above identity we derive further

$$t_n(P) = t(T^{n-1}(P)) + t(T^{n-2}(P)) + \cdots + t(P). \tag{3}$$

From this equation we obtain

$$\int_\sigma t_n(P)\,dP = \int_\sigma t(T^{n-1}(P))\,dP + \cdots + \int_\sigma t(P)\,dP, \tag{4}$$

where $dP$ stands for the $(n-1)$-dimensional volume element $\omega(P)\,d\sigma$.

But $\int_\sigma t(P)\,dP$ extended over $\sigma$ is the total volume $V$ of $M$, according
to the hypothesis of strong transitivity. For, this integral represents
the measure of all the trajectories which issue from $\sigma$, and the remaining
measurable set of trajectories is therefore of measure 0 by this hypothesis.
Moreover, since $\int dP$ is conserved by $T$, and $T$ transforms $\sigma$ into itself
except over a set of measure 0, we have

$$\int_\sigma t(T^{n-1}(P))\,dP = \int_\sigma t(T^{n-2}(P))\,dP = \cdots = V.$$

Thus (4) gives us

$$\frac{\int_\sigma t_n(P)\,dP}{n\int_\sigma dP} = \frac{V}{\int_\sigma v\cos\theta\,d\sigma} = \alpha. \tag{5}$$

In other words the mean time per crossing, averaged up to the $n$th crossing of $\sigma$, is precisely the ratio
$\alpha$ of the total volume to the rate of flux across the surface $\sigma$.
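The ratio of volume to flux can be checked numerically on a simple case. The following is only a sketch under assumptions that are not in the text: the manifold is taken to be the unit torus with the linear flow $(dx/dt, dy/dt) = (1, \omega)$, $\omega$ irrational, and $\sigma$ is the hypothetical section $x = 0$, $0 \le y < b$. Here $V = 1$ and the flux across $\sigma$ is $b$, so one expects $t_n(P)/n$ to approach $\alpha = 1/b$.

```python
import math

def crossing_times(y0, omega, b, n):
    """Times t_1 < ... < t_n at which the trajectory of the flow (1, omega) on the
    unit torus, started at (0, y0) on the section {x = 0, 0 <= y < b}, recrosses it."""
    times = []
    k = 0
    while len(times) < n:
        k += 1                          # the trajectory meets the circle x = 0 once per unit time
        y = (y0 + k * omega) % 1.0      # height at the k-th meeting
        if y < b:                       # a crossing of sigma only if the height lies in [0, b)
            times.append(k)
    return times

omega = (math.sqrt(5) - 1) / 2          # assumed irrational slope
b = 0.25                                # assumed length of the section
alpha = 1.0 / b                         # total volume (1) divided by flux (1 * b)
times = crossing_times(0.1, omega, b, 5000)
ratio = times[-1] / len(times)          # t_n(P) / n for n = 5000
print(ratio, alpha)
```

The choice of a linear torus flow is made only because its crossings can be computed exactly; it is not the general system considered in the note.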

Now consider the set $S_\delta$ of points $P$ such that for a definite $\delta > 0$ and
for infinitely many values of $n$, we have

$$t_n(P) > n(\alpha + \delta) \qquad (n = 1, 2, \ldots). \tag{6}$$

The set $S_{k,\delta}$ of points $P$ for which this inequality holds for some $n \ge k$
is a measurable set, which diminishes (or at least does not increase) with



increase of $k$, toward the limiting measurable set $S_\delta$. Moreover, the set
$S_\delta$ has the property of invariance under $T$: $T(S_\delta) = S_\delta$. Hence $S_\delta$ has
either the measure 0 or that of $\sigma$, for, according to the hypothesis of
strong transitivity, the measure of the trajectories through $S_\delta$ is

$$\int_{S_\delta} t(P)\,dP = 0 \text{ or } V.$$

Similarly the set $S'_\delta$ of points $P$ such that for a definite $\delta > 0$ and for
infinitely many values of $n$, we have

$$t_n(P) < n(\alpha - \delta) \qquad (n = 1, 2, \ldots), \tag{7}$$

is an invariant measurable set of measure 0 or that of $\sigma$.

If $S_\delta$ and $S'_\delta$ are both of measure 0 for $\delta$ arbitrarily small, we would
conclude at once that, for almost all points $P$ of $\sigma$, and for $n$ sufficiently
large,

$$n(\alpha - \delta) < t_n(P) < n(\alpha + \delta),$$

no matter how small $\delta$ be taken. In this case, of course, the stated theorem
is true.

If this is not the case, suppose for example that $S_\delta$ has the measure of
$\sigma$ for some $\delta > 0$. Certainly in that event, the set $S_{k,\delta}$ for $k = 1$ will also
have the measure of $\sigma$, since $S_{1,\delta}$ is the set for which (6) holds for some $n$.

Now $S_{1,\delta}$ can be broken up into the sequence of distinct classes $U_1$,
$U_2$, ... of points $P$ defined as follows:

$U_1$: $t(P) > \alpha + \delta$;

$U_2$: $t_2(P) > 2(\alpha + \delta)$, $P$ not in $U_1$;

$U_3$: $t_3(P) > 3(\alpha + \delta)$, $P$ not in $U_1$ or $U_2$;

. . . . . . . . . .

In consequence, if $P$ is a point of $U_k$ we have

$$t_k(P) > k(\alpha + \delta), \qquad t_l(P) \le l(\alpha + \delta) \quad (1 \le l < k),$$

whence, by subtraction and use of (3),

$$t_{k-l}(T^l(P)) > (k - l)(\alpha + \delta).$$

We infer that, if $P$ is a point of $U_k$, then $T^l(P)$ for $l < k$ is a point of one
of the sets $U_{k-l}, U_{k-l-1}, \ldots, U_1$. It follows that, as $l$ increases, $T^l(P)$
falls successively in sets among $U_{k-1}, \ldots, U_1$ with lower subscripts, not more
than $k - 1$ points being required before a point of this set falls in $U_1$.
Thus it is seen that one may separate $U_k$ into $k - 1$ distinct measurable
sets $U_{kj}$ $(j = 1, \ldots, k-1)$, such that, in $U_{kj}$, the points $T(P)$,
$T^2(P), \ldots, T^j(P)$ fall in sets $U_i$ with decreasing subscripts $i < k$, the last point
$T^j(P)$ only being in $U_1$.

The measurable sets in $U_k, \ldots, U_1$,

$$U_{kj},\ T(U_{kj}),\ \ldots,\ T^{k-1}(U_{kj}) \qquad (j = 1, 2, \ldots, k-1), \tag{8}$$



are all distinct from one another. In fact, if there were a point $P$ in
common to

$$T^{j_1'}(U_{kj_1}) \quad\text{and}\quad T^{j_2'}(U_{kj_2}) \qquad (j_2' \ge j_1'),$$

the transformation $T^{-j_1'}$ would give a corresponding point in common to

$$U_{kj_1} \quad\text{and}\quad T^{j_2'-j_1'}(U_{kj_2}).$$

But this is obviously not possible for $j_2' - j_1' > 0$; and is only possible for
$j_2' - j_1' = 0$ if $j_1 = j_2$. Hence there are no such common points.

Next let us consider the measurable part of $U_{k-1}$ made up of points
$P$ not in any such set $T(U_{kj})$. This part may be likewise separated in measurable
parts $U_{k-1,j}$ $(j = 1, \ldots, k-2)$ such that if $P$ lies in $U_{k-1,j}$ then
$T(P), \ldots, T^j(P)$ fall in sets $U_i$ $(i < k-1)$ with decreasing subscripts,
the last point $T^j(P)$ only being in $U_1$.

The sets

$$U_{k-1,j},\ T(U_{k-1,j}),\ \ldots,\ T^{k-2}(U_{k-1,j}) \qquad (j = 1, \ldots, k-2) \tag{9}$$

so obtained are again distinct from one another. Furthermore they are
entirely distinct from the previous sets. For if there were a point $P$ in
common to

$$T^{j_1'}(U_{kj_1}) \quad\text{and}\quad T^{j_2'}(U_{k-1,j_2}),$$

a transformation by $T^{-j}$, where $j$ is the lesser of the integers $j_1'$, $j_2'$ or
else is their common value, would give a corresponding point common to

$$U_{kj_1} \quad\text{and}\quad U_{k-1,j_2} \qquad\text{if } j = j_1' = j_2',$$

or common to

$$U_{kj_1} \quad\text{and}\quad T^{j_2'-j_1'}(U_{k-1,j_2}) \qquad\text{if } j = j_1' < j_2',$$

or common to

$$T^{j_1'-j_2'}(U_{kj_1}) \quad\text{and}\quad U_{k-1,j_2} \qquad\text{if } j = j_2' < j_1'.$$

The first and second cases are obviously impossible. The last case could
only arise for $j_1' - j_2' = 1$; but this is also impossible since $U_{k-1,j_2}$ $(j_2 = 1,
\ldots, k-2)$ falls in the part of $U_{k-1}$ not in $T(U_k)$ and so not in any $T(U_{kj_1})$.

Proceeding in the same manner, we may define a set of entirely distinct
measurable sets

$$U_{kj},\ U_{k-1,j},\ \ldots,\ U_{10},$$



of which the last $U_{10}$ consists of the points of $U_1$ not in any preceding set.
These will have the property that the finite set of measurable sets

$$T^m(U_{lj}) \qquad (l = 2, \ldots, k;\ j = 1, 2, \ldots, l-1;\ m = 0, 1, \ldots, l-1)$$

are entirely distinct from one another and, together with $U_{10}$, exhaust

$$S_{1,\delta,k} = U_1 + U_2 + \cdots + U_k.$$

Consider now the integral

$$\int_{S_{1,\delta,k}} t(P)\,dP = \sum_{l,j,m} \int_{T^m(U_{lj})} t(P)\,dP + \int_{U_{10}} t(P)\,dP.$$

The sum on the right may be written

$$\sum_{l,j} \int_{U_{lj} + \cdots + T^{l-1}(U_{lj})} t(P)\,dP,$$

which is the same as

$$\sum_{l,j} \int_{U_{lj}} t_l(P)\,dP.$$

Since $U_{lj}$ is a part of $U_l$, each partial integral in $\sum$ exceeds $l(\alpha + \delta)
\int_{U_{lj}} dP$, by definition of $U_l$. This quantity is the same as

$$(\alpha + \delta) \int_{U_{lj} + \cdots + T^{l-1}(U_{lj})} dP$$

since $\int dP$ is conserved by $T$. Likewise, $\int_{U_{10}} t(P)\,dP$ exceeds $(\alpha + \delta)
\int_{U_{10}} dP$. Hence we deduce the inequality

$$\int_{S_{1,\delta,k}} t(P)\,dP > (\alpha + \delta) \int_{S_{1,\delta,k}} dP$$

for $k = 1, 2, \ldots$. But, inasmuch as $S_{1,\delta}$ has the measure of $\sigma$ and is the
limit of $S_{1,\delta,k}$ for $k = 1, 2, \ldots$, we would then conclude

$$\int_\sigma t(P)\,dP \ge (\alpha + \delta) \int_\sigma dP,$$

which is manifestly impossible.

Evidently the argumentation just given applies equally well for any
numerable set of distinct measurable elements of surface, $\sigma$, which make
an angle $\theta > d > 0$ with the trajectories and have a finite $\int dP$.
Thus the theorem is proved not only for any single surface but for
any such measurable aggregate.

¹ Méthodes nouvelles de la Mécanique Céleste, t. 3.

² Not yet published.

³ "Hamiltonian Systems and Transformations in Hilbert Space," these Proceedings, 17, 315-318 (May, 1931).






PROOF OF THE ERGODIC THEOREM

By George D. Birkhoff

Department of Mathematics, Harvard University
Communicated December 1, 1931

Let

$$\frac{dx_i}{dt} = X_i(x_1, \ldots, x_n) \qquad (i = 1, \ldots, n)$$

be a system of n differential equations valid on a closed analytic manifold 
M, possessing an invariant volume integral, and otherwise subject to the 
same restrictions as in the preceding note, except that the hypothesis of 
strong transitivity is no longer made. 

We propose to establish first that, without this hypothesis, we have

$$\lim_{n \to \infty} \frac{t_n(P)}{n} = \tau(P) \tag{1}$$

for all points $P$ of the surface $\sigma$ save for points of a set of measure 0. In
other words, there is a "mean time $\tau(P)$ of crossing" of $\sigma$ for the general
trajectory.

The proof of the "ergodic theorem," that there is a time-probability $p$
that a point $P$ of a general trajectory lies in a given volume $v$ of $M$, parallels
that of the above recurrence theorem, as will be seen.

The important recent work of von Neumann (not yet published) shows
only that there is convergence in the mean, so that (1) is not proved by
him to hold for any point $P$, and the time-probability is not established
in the usual sense for any trajectory. A direct proof of von Neumann's
results (not yet published) has been obtained by E. Hopf.

Our treatment will be based upon the following lemma: If $S_\lambda$ [$S'_\lambda$] is
a measurable set on $\sigma$, which is invariant under $T$, except possibly for a
set of measure 0, and if for any point $P$ of this set

$$\limsup_{n \to \infty} \frac{t_n(P)}{n} \ge \lambda > 0 \qquad \left[\ \liminf_{n \to \infty} \frac{t_n(P)}{n} \le \lambda > 0\ \right] \tag{2}$$

then

$$\int_{S_\lambda} t(P)\,dP \ge \lambda \int_{S_\lambda} dP \qquad \left[\ \int_{S'_\lambda} t(P)\,dP \le \lambda \int_{S'_\lambda} dP\ \right]. \tag{3}$$

We consider only the first case, for the proof of the second case is entirely
similar. In analogy with the preceding note, define the distinct measurable
sets $U_1, U_2, \ldots$ on $S_\lambda$ so that for $P$ in $U_n$

$$t_n(P) > n(\lambda - \epsilon) \qquad (P \text{ not in } U_1, U_2, \ldots, U_{n-1}). \tag{4}$$



The quantity $\epsilon > 0$ is taken arbitrarily. It is, of course, clear that for
every point $P$ of $S_\lambda$

$$t_n(P) > n(\lambda - \epsilon)$$

for infinitely many values of $n$, so that all such points belong to at least
one of the sets $U_1, U_2, \ldots$. Now, by the argument of the earlier note,
we infer

$$\int_{S_\lambda^k} t(P)\,dP > (\lambda - \epsilon) \int_{S_\lambda^k} dP$$

where $S_\lambda^k = U_1 + U_2 + \cdots + U_k$. But $S_\lambda^k$ is, for every value of $k$,
a measurable part of the invariant set $S_\lambda$ and increases toward a limit
$U_1 + U_2 + \cdots$ which contains every point of $S_\lambda$. Consequently we
obtain by a limiting process

$$\int_{S_\lambda} t(P)\,dP \ge (\lambda - \epsilon) \int_{S_\lambda} dP$$

for any $\epsilon > 0$, whence the inequality of the lemma.

The recurrence theorem stated results directly from this lemma.
Consider the measurable invariant set of points $P$ on $\sigma$ for which

$$t_n(P) \ge n\lambda \tag{5}$$

for infinitely many values of $n$ (see the preceding note). This is a set
$S_\lambda$ to which the lemma applies. Similarly the set of points $P$ on $\sigma$ for
which

$$t_n(P) < n\lambda \tag{6}$$

for infinitely many values of $n$ is a set $S'_\lambda$ of the kind specified in the lemma.

The set $S_\lambda$ diminishes and the set $S'_\lambda$ increases with $\lambda$, and both sets
taken together exhaust $\sigma$. The measure of the set $S_\lambda$ must tend toward
0 as $\lambda$ increases. Otherwise it would tend toward an invariant measurable
set $S^*$ of positive measure, for which the inequality of the lemma
holds for $\lambda = \Lambda$, an arbitrarily large positive quantity, and we should infer

$$\int_{S^*} t(P)\,dP \ge \Lambda \int_{S^*} dP$$

for any $\Lambda$, which is absurd. Moreover, when $\lambda$ tends toward 0, $S'_\lambda$
becomes vacuous, since there is a least time of crossing, $\lambda_0$. In a similar
way the set $S'_\lambda$ increases with $\lambda$ from a set of zero measure for $\lambda < \lambda_0$ toward $\sigma$.

If then $S_\lambda$ and $S'_\lambda$ are not essentially complementary parts of $\sigma$, one
decreasing, the other increasing, they must, for certain values of $\lambda$, have
a common measurable component of positive measure, also invariant under $T$




where $\bar\tau(P) \le \tau(P)$; while at the same time, of course,

$$\lim_{n \to \infty} \frac{t_n(P)}{n} = \tau(P) > 0.$$

We conclude that the following "ergodic theorem" holds:

For any dynamical system of type (1) there is a definite "time probability"
$p$ that any moving point, excepting those of a set of measure 0, will lie in a
region $v$; that is,

$$\lim_{t \to \infty} \frac{\bar t}{t} = p \le 1$$

will exist, where $t$ denotes total elapsed time measured from a fixed point and
$\bar t$ the elapsed time in $v$.

For a strongly transitive system $p$ is, of course, the ratio of the volume
of $v$ to $V$.
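The time probability can be estimated numerically in a simple case. The sketch below rests on assumptions that are not in the text: the system is the linear flow $(1, \omega)$ on the unit torus (so $V = 1$), the region $v$ is a hypothetical rectangle $0 \le x < a$, $0 \le y < b$, and the time average is approximated by sampling the trajectory at equally spaced instants rather than integrating exactly.

```python
import math

def time_fraction_in_region(a, b, omega, dt, steps):
    """Fraction of sampled instants t = k*dt at which the trajectory of the flow
    (1, omega) on the unit torus, started at the origin, lies in [0, a) x [0, b)."""
    inside = 0
    for k in range(steps):
        t = k * dt
        x = t % 1.0
        y = (omega * t) % 1.0
        if x < a and y < b:
            inside += 1
    return inside / steps

a, b = 0.3, 0.2                      # assumed sides of the region v
volume_ratio = a * b                 # volume of v divided by V = 1
p = time_fraction_in_region(a, b, math.sqrt(3), math.sqrt(2) / 10, 200000)
print(p, volume_ratio)
```

For this flow the estimated fraction of time spent in $v$ should be close to the volume ratio, in line with the statement for strongly transitive systems; the sampling step and the slope are arbitrary illustrative choices.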

Evidently the germ of the above argument is contained in the lemma. 

The abstract character of this lemma is to be observed, for it shows that 
the theorem above will extend at once to function space under suitable 
restrictions. 

It is obvious that $\tau(P)$ and $\bar\tau(P)$ as defined above satisfy functional
relations of the following type:

$$\int \lambda\,dm(S_\lambda) = \int_{S_\lambda} t(P)\,dP$$

where the integral on the left is a Stieltjes integral, $m(S_\lambda)$ being the measure
of $S_\lambda$.





Reprinted from Atti del Congresso Intern. dei Matematici, Bologna, 3-10
settembre 1928-VI, 1931, Vol. 5, pp. 5-13.


G. D. Birkhoff (Cambridge - Mass. - U. S. A.) 


A NEW CRITERION OF STABILITY 


Let

$$\frac{dx_i}{dt} = X_i(x_1, \ldots, x_n) \qquad (i = 1, \ldots, n) \tag{1}$$

be any system of $n = 2\mu + 1$ equations of the first order, having $x_1, \ldots, x_n$ as
dependent variables and the time $t$ as independent variable. Here $X_i$ $(i = 1, \ldots, n)$
are real analytic functions of the real variables $x_1, \ldots, x_n$ in the region of $x_1, \ldots, x_n$
space under consideration. Such a system of equations arises for instance in
ordinary dynamical problems having $\mu + 1$ degrees of freedom, after the energy
integral has been employed to reduce the order from $2(\mu + 1)$ to $n = 2\mu + 1$.
Suppose now that a periodic motion is given,

$$x_i = f_i(t) \qquad (i = 1, \ldots, n), \tag{2}$$

so that the functions $f_i$ are periodic functions of $t$ of period $2\pi$, and the set
$x_i = f_i(t)$ $(i = 1, \ldots, n)$ forms a solution of (1). The neighborhood of the corresponding
closed curve $C$ in $n$ dimensional $x_1, \ldots, x_n$ space has the topological nature of
an $n$ dimensional torus. If we choose an angular variable $\tau$ over this torus,
and $n - 1$ suitable geometric variables which we will call $x_1, \ldots, x_{2\mu}$, the equations
(1) take the form

$$\frac{dx_i}{d\tau} = X_i(x_1, \ldots, x_{2\mu}, \tau) \qquad (i = 1, \ldots, 2\mu), \tag{3}$$

upon elimination of $t$, where $X_i$ stand for the new right-hand members. These
functions $X_i$ are periodic in $\tau$ of period $2\pi$. The equations (3) may be made
to have the solution $x_i = 0$ $(i = 1, \ldots, 2\mu)$, corresponding to the given periodic
motion.

If we make an arbitrary transformation of variables

$$x_i = \varphi_i(y_1, \ldots, y_{2\mu}, \tau) \qquad (i = 1, \ldots, 2\mu), \tag{4}$$

in which $\varphi_i$ are arbitrary analytic functions of $y_1, \ldots, y_{2\mu}, \tau$, periodic in $\tau$ of
period $2\pi$, all vanishing but with functional determinant $J_0 \ne 0$ at $y_1 = \cdots = y_{2\mu} = 0$
for any $\tau$, there is readily obtained a new system of the same type (3) in the
variables $y_1, \ldots, y_{2\mu}$. Moreover if (4) be defined by $2\mu$ divergent power series
in $y_1, \ldots, y_{2\mu}$, in which the coefficients are analytic periodic functions of period $2\pi$,



with $J_0 \ne 0$, there exists a uniquely determined set of formal differential equations
(3) in $y_1, \ldots, y_{2\mu}$, in which the right-hand members $Y_i$ $(i = 1, \ldots, 2\mu)$ are
given by similarly constituted series. Thus we obtain the formal group of transformations
(4) of (1) (¹).

Suppose now that the solution $x_1, \ldots, x_{2\mu}$ of (3) which at $\tau = 0$ takes on the
values $x_1^0, \ldots, x_{2\mu}^0$ be

$$x_i = F_i(x_1^0, \ldots, x_{2\mu}^0, \tau) \qquad (i = 1, \ldots, 2\mu).$$

These are also functions analytic in $x_1^0, \ldots, x_{2\mu}^0, \tau$, but not in general periodic of
period $2\pi$ in $\tau$. Moreover, these functions vanish for $x_1^0 = \cdots = x_{2\mu}^0 = 0$ with $J_0 \ne 0$.
If we put $\tau = 2\pi$ in these equations, we obtain a one-to-one analytic transformation
$T$ from $x_1^0, \ldots, x_{2\mu}^0$ to $x_1^1, \ldots, x_{2\mu}^1$ defined by

$$T:\quad x_i^1 = F_i(x_1^0, \ldots, x_{2\mu}^0, 2\pi) \qquad (i = 1, \ldots, 2\mu), \tag{5}$$

associated with the given periodic motion. This transformation $T$ has an invariant
point at the origin in $x_1, \ldots, x_{2\mu}$ space with functional determinant not 0 there.
It is not difficult conversely to set up a system (3) corresponding to any such
transformation $T$, nor to prove that if two dynamical systems lead to essentially
the same transformation $T$, then they are equivalent near the corresponding
periodic motion in the sense that a transformation (4) carries one system into
the other (²).
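The passage from a system (3) to its transformation $T$ can be imitated numerically by integrating over one period. The following sketch is only illustrative: the right-hand members form a hypothetical linear system with an assumed constant `LAM` (not taken from the paper), and the integration uses the classical fourth-order Runge-Kutta scheme.

```python
import math

def period_map(field, x0, steps=2000):
    """Integrate dx/dtau = field(x, tau) from tau = 0 to tau = 2*pi with classical
    RK4 and return the end point, i.e. the image of x0 under the period map."""
    h = 2 * math.pi / steps
    x = list(x0)
    tau = 0.0
    for _ in range(steps):
        k1 = field(x, tau)
        k2 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k1)], tau + 0.5 * h)
        k3 = field([xi + 0.5 * h * ki for xi, ki in zip(x, k2)], tau + 0.5 * h)
        k4 = field([xi + h * ki for xi, ki in zip(x, k3)], tau + h)
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
        tau += h
    return x

LAM = 0.3  # assumed frequency of the illustrative system

def field(x, tau):
    # Hypothetical right-hand members: a uniform rotation about the origin.
    return [-LAM * x[1], LAM * x[0]]

origin_image = period_map(field, [0.0, 0.0])   # the origin is an invariant point
col1 = period_map(field, [1.0, 0.0])           # first column of the (linear) map
col2 = period_map(field, [0.0, 1.0])           # second column
trace = col1[0] + col2[1]
det = col1[0] * col2[1] - col2[0] * col1[1]
print(trace, 2 * math.cos(2 * math.pi * LAM), det)
```

For this particular system the period map is a rotation through the angle $2\pi \cdot$ `LAM`, so the roots of its characteristic equation are conjugate numbers of modulus 1; the trace and determinant computed above are consistent with that.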

Thus it is evident that transformations of the general type (5) from $x_1^0, \ldots, x_{2\mu}^0$
to $x_1^1, \ldots, x_{2\mu}^1$ in the vicinity of an invariant point may be classified in the same
way as the motions of a dynamical system near a periodic motion. In particular
there will enter $2\mu$ multipliers $\rho_1, \ldots, \rho_{2\mu}$ defined as the roots of the characteristic
determinantal equation

$$\left| \frac{\partial F_i}{\partial x_j^0} - \rho\,\delta_{ij} \right| = 0,$$

where as usual $\delta_{ii} = 1$ and $\delta_{ij} = 0$ if $i \ne j$. If these roots occur in pure imaginary
pairs $\pm\lambda_i\sqrt{-1}$ $(i = 1, \ldots, \mu)$, with $\lambda_1, \ldots, \lambda_\mu$ bound by no commensurability relation for all sets of integers $m_1, \ldots, m_\mu$
not all 0, the transformation $T$ as well as the multipliers, may be said to be
of «general stable type». Of course a differential system (3) and an associated
transformation (5) are both or neither of general stable type. Likewise «completely
stable» transformations $T$ may be defined in a way entirely analogous
to completely stable periodic motions. A periodic motion and its associated transformation
$T$ will either both be completely stable, or neither will be.

(¹) Cf. my book: Dynamical Systems (New York, 1927), chapters 3, 4, for various facts
concerning the formal group, normal forms, complete stability, etc., which I assume as known.

(²) At least if the right-hand members $X_i$ of the corresponding differential equations (3)
are required only to be continuous together with their partial derivatives of all orders.

The case of general stable type in which the given periodic motion is completely
stable is of great interest. I have demonstrated that a necessary and
sufficient condition for complete stability is that the given system be reversible
in a formal sense, i. e. that there exists a transformation

$$\bar\tau = -\tau, \qquad x_i = \varphi_i(\bar x_1, \ldots, \bar x_{2\mu}, \bar\tau) \qquad (i = 1, \ldots, 2\mu),$$

where the transformation $x_i = \varphi_i$ lies in the formal group, for which the new
system of equations does not differ from the given system except that $\bar x_i$, $\bar\tau$
replace $x_i$, $\tau$ respectively (¹).

With the aid of this result it is not difficult to establish the following new
criterion for complete stability:

A necessary and sufficient condition that a given periodic motion of
general stable type be completely stable is that the associated transformation
$T$ be the formal product of two transformations $U$, $V$ of involutoric
type: $T = UV$; $U^2 = V^2 = I$ ($I$, the identity).

It should be explicitly observed that $U$ and $V$ are to be given by power
series in $x_1, \ldots, x_{2\mu}$ without constant terms, and with functional determinant not 0
at the origin.
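The simplest instance of such a product is a plane rotation written as the product of two reflections. The sketch below is an illustration only, with an assumed angle `theta` and plain 2x2 matrices; it checks $T = UV$, $U^2 = V^2 = I$, and the consequence $T^{-1} = UTU^{-1}$ used in the sufficiency proof.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

theta = 2 * math.pi * 0.3                  # assumed rotation angle
c, s = math.cos(theta), math.sin(theta)

IDENTITY = [[1.0, 0.0], [0.0, 1.0]]
T = [[c, -s], [s, c]]                      # rotation through theta
T_inv = [[c, s], [-s, c]]                  # rotation through -theta
U = [[c, s], [s, -c]]                      # reflection in the line at angle theta/2
V = [[1.0, 0.0], [0.0, -1.0]]              # reflection in the x-axis

print(close(matmul(U, V), T))                       # T = UV
print(close(matmul(U, U), IDENTITY))                # U is involutoric
print(close(matmul(V, V), IDENTITY))                # V is involutoric
print(close(matmul(matmul(U, T), U), T_inv))        # T^{-1} = U T U^{-1}, with U^{-1} = U
```

The decomposition is not unique here either: any pair of reflections in lines meeting at the angle `theta / 2` has the same product.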

The proof that the condition is sufficient is immediate. The transformation
associated with the modified equations after the introduction of a new variable
$\bar\tau = -\tau$ is precisely the inverse $T^{-1}$ of the transformation $T$ associated with
the given equations, inasmuch as when $\tau$ decreases by $2\pi$, the point $x_i^1$ evidently
goes into $x_i^0$. But we have

$$T^{-1} = (UV)^{-1} = VU = U(UV)U^{-1} = UTU^{-1},$$

because of the involutoric property. Therefore the transformation $T^{-1}$ associated
with the transformed equations in $\bar x_1, \ldots, \bar x_{2\mu}$, $\bar\tau$ is equivalent to the transformation $T$
under the formal group. It follows that the modified system is equivalent to
the original system under the formal group. Hence by the theorem stated above,
the periodic motion is of completely stable type.

The proof that the condition is necessary may be made by means of the
normal form of the differential equations, obtained as a result of a transformation
of the formal group. This form is

$$\frac{d\xi_i}{d\tau} = M_i\,\xi_i, \qquad \frac{d\eta_i}{d\tau} = -M_i\,\eta_i \qquad (i = 1, \ldots, \mu),$$

where $M_i$ are pure imaginary power series with constant terms $\lambda_i\sqrt{-1}$ in the $\mu$
products $\xi_i\eta_i$ such that

$$m_1\lambda_1 + \cdots + m_\mu\lambda_\mu + 2\pi p \ne 0$$

for any integers $m_1, \ldots, m_\mu$, $p$ not all zero. The formal solution of these equations
is

$$\xi_i = \xi_i^0 e^{M_i^0\tau}, \qquad \eta_i = \eta_i^0 e^{-M_i^0\tau} \qquad (i = 1, \ldots, \mu) \tag{6}$$

with corresponding formal transformation $T$

$$\xi_i^1 = \xi_i^0 e^{2\pi M_i^0}, \qquad \eta_i^1 = \eta_i^0 e^{-2\pi M_i^0}. \tag{7}$$

This transformation is of general stable type since its multipliers

$$e^{\pm 2\pi\lambda_i\sqrt{-1}}$$

have the prescribed property. (Of course this transformation (7) takes real
form when real variables are employed instead of the conjugate imaginary variables
$\xi_i$, $\eta_i$.)

(¹) See my book, chap. IV, section 7.

Now (7) may be regarded as the normal form for a completely stable transformation
$T$ in normal variables, just as (6) gives the analogous normal solution
of the completely stable differential system. Introduce variables $\bar\xi_i$, $\bar\eta_i$ as defined
by the following set of equations:

V (i—1,...., p). 


thus defining of course an involutoric transformation U in the original variables 
which appear in T. Further define the similar transformation V 


W-e Vif 


-iwjrVL 


where $\bar M_i$ denotes the expression for $M_i$ after the products $\xi_j\eta_j$ $(j = 1, \ldots, \mu)$
are replaced by $\bar\xi_j\bar\eta_j$. We find from these transformations that

erw-lvi-ftW'- 


Hence $\bar M_i$ is equal to $M_i$ formally, and the transformation $V$ is also of period 2.
It is at once verified that the compound transformation $UV$ is precisely the
transformation $T$ expressed in terms of the normal variables.

Thus the condition is necessary as well as sufficient. 

The decomposition of $T$ into the product of two involutoric transformations
is by no means unique of course. The above discussion shows that this decomposition
$T = UV$ admits of being so chosen that each component transformation
$U$, $V$ leaves unaltered every formally invariant function under the complete
transformation $T$.

It is interesting to observe the relation between my earlier theorem and the
present theorem. The earlier theorem applied to transformations $T$ of completely
stable type shows at once that $T^{-1}$ is equivalent to $T$, i. e. that a transformation
$A$ exists such that

$$T^{-1} = ATA^{-1}.$$




Here $A$ is given by formal power series without constant terms and with functional
determinant not 0 at the origin. The present theorem affirms that $A$ may
be made involutoric in addition.

An inspection of the above proof shows at once that if the normal form be
considered as characteristic of equations of true dynamical type, whether the
variables $\xi_i$, $\eta_i$ be real or conjugate imaginary series, then the property that $T$
be expressible as a product of two involutoric transformations is equally characteristic
of the transformations of dynamical type.

Furthermore, if a periodic motion is of completely stable type, so is also
the periodic motion thought of as described $k$ times $(k > 1)$. But in this case
the associated transformation is $T^k$. Thus in this case the $k$th power of a completely
stable transformation is also of completely stable type. This result is
perfectly general, for with any completely stable transformation $T$ may be associated
a corresponding completely stable system of differential equations.

The converse is also true: if a transformation $T^*$ of general stable type can
be expressed in the form $(UV)^k$, then $T^*$ will be completely stable. In fact we have

$$T^{*-1} = (VU)^k = V(UV)^kV^{-1} = VT^*V^{-1}$$

so that $T^{*-1}$ is equivalent to $T^*$ which, as was pointed out above, necessitates
that $T^*$ be of completely stable type.

Moreover any completely stable transformation admits of being expressed in
the form $(UV)^k$ for $k$ arbitrary. In fact if we use normal variables $\xi_i$, $\eta_i$ as
above, the given transformation appears at once as the formal $k$th power of the
transformation

vV'-vT'e * . 

In the same connection we remark also that any transformation $T$ of general
stable type which may be written

$$T = UVU^*V,$$

where $U$, $V$, $U^*$ are involutoric, must be of completely stable type, for we have

$$T^{-1} = VU^*VU = UTU^{-1}.$$

The converse is also true since according to what has been proved above we
may take $U^* = U$.

This last form of expression of T admits of further generalization. 

In connection with the new criterion for complete stability there are at once
suggested a number of very interesting questions. It may happen that there
exist one or more decompositions of $T$ of such types $UV$, $(UV)^k$, $UVU^*V$ as
are given above, in which the involutoric elements involved are given by convergent
series. For example, in the restricted problem of three bodies (¹), the
associated transformation $T$ has the form $UV$. Similarly in the general geodesic
problem on a convex surface, $T$ has the form $UVU^*V$ (²).

The existence of such actual decompositions of these types is highly
significant for the dynamical problem. In particular if the transformation
$T$ associated with a given periodic motion of completely stable type
has the form $UV$ actually and not only formally, there must exist infinitely
many periodic motions in the neighborhood of the given periodic motion
which intersect a particular analytic $\mu$ dimensional surface $I$ in $x_1, \ldots, x_{2\mu}$
space, at least if a certain related functional determinant $|a_{ij}|$ does not
vanish, where the constants $a_{ij}$ appear in the $\mu$ series $M_i$ involved in the
normal form:

$$M_i = \Big(\lambda_i + \sum_{j=1}^{\mu} a_{ij}\,\xi_j\eta_j + \cdots\Big)\sqrt{-1}. \tag{8}$$

Before outlining a proof of this statement we observe that this result presents
a partial $2\mu$ dimensional generalization of Poincaré's last geometric theorem
to dynamical systems having an arbitrary number of degrees of freedom, applicable
in case $T$ admits of such an actual decomposition $UV$, since the iterates
of $T$ have infinitely many nearby invariant points corresponding to these nearby
periodic motions. Unfortunately such an actual decomposition of $T$ does not appear
to be always possible.

To prove our statement, we consider first the surface I of invariant points 
of the involutoric transformation U. 

As noted above, the inverse transformation $T^{-1}$ is equivalent to $T$ by means
of the change of variables corresponding to $U$. Upon the basis of the normal
form for $T$ above, this fact determines the form of $U$ to be essentially

h—ySt, yi-f t . /*) 

where f, are power series in the products ( t yt,.~., £ M y M with constant term 1 
and such that £ are conjugate to f t ( 3 ). By modifying the normal variables 
to *,/r 7 „ rp) 1 i we may furthemore reduce ft to 1. 

(¹) Cf. my paper: The restricted problem of three bodies, Rendiconti di Palermo, vol. 39
(1915), sections 19, 20.

(²) Cf. my book cited above, chap. VI, section 10.

(») See the analysis in chap. IV, section 7 of my book where it is shown that the most 
general transformation of variables preserving the normal form is 

»?. =- n*o> 

where ? t , a, are arbitrary conjugate power series in the m products £,»7i not 

lacking constant terms. This yields and thence figifid 1 if U is to be 




In the original real variables, say $x_i$, $y_i$ $(i = 1, \ldots, \mu)$, instead of such conjugate
imaginary variables, this transformation takes essentially the form

$$\bar x_i = x_i + \cdots, \qquad \bar y_i = -y_i + \cdots \qquad (i = 1, \ldots, \mu)$$

where the indicated terms in the series are of the second or of higher degree.

Putting $\bar x_i = x_i$, $\bar y_i = y_i$ in these equations we obtain as the formal equations
of the invariant surface $I$

$$F_i = 0, \qquad -2y_i + G_i = 0 \qquad (i = 1, \ldots, \mu),$$

where the functions $F_i$, $G_i$ are convergent power series in $x_1, \ldots, x_\mu$, $y_1, \ldots, y_\mu$
beginning with terms of higher than the first degree.

If the second set of these equations are satisfied it may be proved that the
first set of equations are satisfied in consequence of this fact, and thus that $I$
is an analytic $\mu$ dimensional surface through the origin in $x_1, y_1, \ldots, x_\mu, y_\mu$ space.


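In fact we have for $i = 1, \ldots, \mu$

$$\xi_i = P_i + \sqrt{-1}\,Q_i, \qquad \eta_i = P_i - \sqrt{-1}\,Q_i,$$
$$\bar\xi_i = \bar P_i + \sqrt{-1}\,\bar Q_i, \qquad \bar\eta_i = \bar P_i - \sqrt{-1}\,\bar Q_i,$$

where $P_i$, $Q_i$ are real power series in the variables $x_1, \ldots, x_\mu$, $y_1, \ldots, y_\mu$, and $\bar P_i$,
$\bar Q_i$ are the same power series in $\bar x_1, \ldots, \bar x_\mu$, $\bar y_1, \ldots, \bar y_\mu$. The transformation from $x_i$,
$y_i$ to $\bar x_i$, $\bar y_i$ thus may be written formally

$$\bar P_i = P_i, \qquad \bar Q_i = -Q_i,$$

and the formal equations of the invariant surface $I$ reduce to $\mu$ equations only:

$$Q_i = 0 \qquad (i = 1, \ldots, \mu).$$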

Now the $2\mu$ equations earlier written in which only convergent
series appear may be given the form

$$F_i^* = 0, \qquad Q_i + G_i^* = 0 \qquad (i = 1, \ldots, \mu),$$

in which $F_i^*$, $G_i^*$ are power series in $P_1, \ldots, P_\mu$, $Q_1, \ldots, Q_\mu$. These of course
must be formally satisfied if and only if $Q_i = 0$ for $i = 1, \ldots, \mu$. Hence every
term in $F_i^*$, $G_i^*$ must contain at least one factor $Q_1, \ldots, Q_\mu$. Hence also the
last $\mu$ equations give not only a necessary, but also a sufficient condition
that $x_i$, $y_i$ is an invariant point of $I$.


involutoric, where f.g, and f, 0i are the same power series in the m products, save that in 
the second the products Sfa appear instead of Hence we infer f,y t = - i, which 
yields the stated conclusion. 




By proper algebraic manipulation of these equations for $I$ and change of
variables they may clearly be given the form

$$y_i + \cdots = 0 \qquad (i = 1, \ldots, \mu), \tag{9}$$

where the omitted terms are of as high degree $m$ as desired.

Now it may be proved that under these circumstances $T^{(k)}$ has the form

$$x_i^{(k)} = x_i\cos 2k\pi N_i - y_i\sin 2k\pi N_i + R_i^{(k)},$$
$$y_i^{(k)} = x_i\sin 2k\pi N_i + y_i\cos 2k\pi N_i + S_i^{(k)}, \tag{10}$$

where $N_i$ denote the real series $M_i/\sqrt{-1}$ broken off at terms of degree $m$, and
where $R_i^{(k)}$, $S_i^{(k)}$ are of arbitrarily high order $m$ in $\varepsilon$, with

$$\varepsilon^2 = x_1^2 + y_1^2 + \cdots + x_\mu^2 + y_\mu^2,$$

provided that k be of order not greater than Furthermore, the first partial 
derivatives of R{\ St' will be of order at least m in e, throughout the same 
range of values of k (*). 

Our next step is to prove that the surfaces $I$ and $I^{(k)}$ (i. e. the $k$th iterate
of $I$ by $T$) will intersect for $k$ large. The equations determining $I^{(k)}$ are evidently
obtained by substituting the expressions for $x_i^{(-k)}$, $y_i^{(-k)}$ in (9) for $x_i$, $y_i$ respectively.
Evidently if $y_i$ be eliminated by means of the equations (9) the equations
so obtained take the form

$$\Phi_i^{(k)} \equiv x_i\sin 2k\pi N_i + W_i^{(k)} = 0 \tag{11}$$

in which $W_i^{(k)}$ are functions of $x_1, \ldots, x_\mu$ only, of order $m$ in $\varepsilon$, and with first
partial derivatives of order $m$ in $\varepsilon$.

But in the range under consideration we can choose $k$ so large and $x_1, \ldots, x_\mu$
arbitrarily small and comparable to one another so that the quantities

$2kN_i = 2k\bigl(\lambda_i + \sum_j a_{ij} x_j^2 + \cdots\bigr) \qquad (i = 1, \ldots, \mu)$

are integers. This choice is possible only because $|a_{ij}|$ is not zero by hypothesis,
and $k$ may be arbitrarily large in comparison with … It is apparent that at
such a point the functions on the left in equation (11) are arbitrarily small
of order $m$ in $\varepsilon$, while $\partial \Phi_i^{(k)} / \partial x_j$ has $\pm 2k\pi\, x_i^{(0)}\, \partial N_i / \partial x_j$, a multiple of $k\, a_{ij}\, x_i^{(0)} x_j^{(0)}$, as its principal part in the
neighborhood of the selected point $x_i^{(0)}$. These $\mu^2$ quantities are in general large in
absolute value, and nearly constant. Since $|a_{ij}| \neq 0$, and $x_1, \ldots, x_\mu$ are all of
the same order as $\varepsilon$, it is clear that the functions $\Phi_i^{(k)}$ must simultaneously
vanish in the neighborhood, so that there must be such a point P of intersection
of $I^{(k)}$ and I, as stated.

(*) The argument necessary to establish these facts is based entirely upon the known
form of T, and involves such inequalities as appear in chap. VI, section 2 of my book.

Let P be such a point of intersection. If we write $Q = T^{-k}(P)$, we have

$UQ = Q, \qquad UP = P, \qquad P = T^{k}(Q).$

Hence

$UP = U(UV)^{k} Q = (VU)^{k} UQ,$

whence also

$P = T^{-k} Q, \quad \text{or} \quad Q = T^{k} P.$

Therefore we obtain

$P = T^{k} Q = T^{2k} P,$

so that P is a point of the surface I invariant under the involutoric
transformation U, which is invariant under $T^{2k}$. Such a point P evidently corresponds
to a periodic motion intersecting the analytic $\mu$-dimensional surface I in the
original $x_1, \ldots, x_n$ space, and lying arbitrarily near to the given periodic motion.

This establishes the statement to be proved.

The question arises naturally whether or not it is possible to draw similar
conclusions in other cases, for example in the case where T may be written in the
form UVU'V referred to above. I do not know whether or not this is the case.
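The chain of identities used in the proof, $U(UV)^k = (VU)^k U$ and $(VU)^k = T^{-k}$ for $T = UV$ with $U$, $V$ involutions, can be checked numerically. The sketch below is only an illustration: it takes $U$ and $V$ to be linear reflections of the plane (an assumption made here, far more special than the analytic involutions of the paper), and the helper names `reflection`, `matmul`, `matpow` and `max_abs_diff` are hypothetical.

```python
import math

def reflection(angle):
    """2x2 matrix of the reflection across the line through the origin at `angle`."""
    c, s = math.cos(2 * angle), math.sin(2 * angle)
    return ((c, s), (s, -c))

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def matpow(a, n):
    """n-th power of a 2x2 matrix for an integer n >= 0."""
    result = ((1.0, 0.0), (0.0, 1.0))
    for _ in range(n):
        result = matmul(result, a)
    return result

def max_abs_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

IDENTITY = ((1.0, 0.0), (0.0, 1.0))
U = reflection(0.3)          # illustrative involution, U*U = identity
V = reflection(1.1)          # illustrative involution, V*V = identity
T = matmul(U, V)             # T = UV
k = 5
lhs = matmul(U, matpow(T, k))                # U (UV)^k
rhs = matmul(matpow(matmul(V, U), k), U)     # (VU)^k U
inverse_check = matmul(matpow(T, k), matpow(matmul(V, U), k))  # T^k (VU)^k
```

Here `lhs` and `rhs` agree to rounding error, and `inverse_check` is the identity, which is the content of the step from $UP = P$, $UQ = Q$ to $P = T^{-k}Q$.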





Reprinted from Bull. de la Soc. Mathém. de France, 1932, Vol. 60, pp. 1-26.


SUR QUELQUES COURBES FERMÉES REMARQUABLES
[On Some Remarkable Closed Curves]


By George D. Birkhoff.


Introduction. 


According to Poincaré (1), every one-to-one, direct and continuous
transformation T of the circumference of a circle into itself,

$T: \quad \theta_1 = f(\theta)$ ($\theta$ an angular coordinate),

is characterized by a unique rotation coefficient $\tau$, such that
every point $\theta$ of the circle is carried into $\theta_n$ by $T^n$ (2), where

$n\tau - 2\pi < \theta_n - \theta < n\tau + 2\pi.$

Hence $\tau$ may be regarded as the mean increase of the variable $\theta$
which results from the transformation T; when $\tau / 2\pi$ is a rational
number $p/q$, there always exists at least one point of the circle
whose angular coordinate increases by precisely $2p\pi$ after
$q$ iterations of T. Evidently the value of $\tau$ does not depend on the
particular angular variable chosen.
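The definition of the rotation coefficient as a mean increase can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical lift `example_lift` of a circle map (the specific formula $\theta + 1 + 0.5 \sin\theta$ is chosen here only for illustration and does not come from the paper); the function names are likewise assumptions.

```python
import math

TWO_PI = 2.0 * math.pi

def iterate(lift, theta, n):
    """Apply the lift n times to theta."""
    for _ in range(n):
        theta = lift(theta)
    return theta

def rotation_coefficient(lift, theta0=0.0, n=100_000):
    """Estimate tau as the mean increase (theta_n - theta_0) / n."""
    return (iterate(lift, theta0, n) - theta0) / n

def max_deviation(lift, tau, theta0=0.0, n_max=500):
    """Largest |theta_n - theta_0 - n*tau| over n = 1..n_max."""
    theta, worst = theta0, 0.0
    for n in range(1, n_max + 1):
        theta = lift(theta)
        worst = max(worst, abs(theta - theta0 - n * tau))
    return worst

def example_lift(theta):
    # Hypothetical lift: increasing, and f(theta + 2*pi) = f(theta) + 2*pi.
    return theta + 1.0 + 0.5 * math.sin(theta)

tau_estimate = rotation_coefficient(example_lift)
deviation = max_deviation(example_lift, tau_estimate)
```

With an accurate estimate of $\tau$, the value `deviation` stays below $2\pi$, in line with the inequality $n\tau - 2\pi < \theta_n - \theta < n\tau + 2\pi$. For a rigid rotation $\theta \mapsto \theta + \omega$ the estimate returns $\omega$ itself.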

Let us now consider in the plane a Jordan curve J, closed and
without double points. Suppose moreover that there exists a
one-to-one, direct and continuous transformation T of an annular
neighborhood of J, and that this transformation leaves the
curve J invariant, that is, $T(J) = J$. Naturally T leaves invariant
the open exterior region $S_e$ as well as the open interior
region $S_i$. But, by its very definition, J is the one-to-one,
direct and continuous image of the circumference of a circle.
Consequently the transformation T gives rise to a rotation
coefficient $\tau$, which indicates the mean increase of an arbitrary
angular variable on the Jordan curve J produced by the
transformation T.

(1) Sur les courbes définies par des équations différentielles (Jour. de
Math., 4th series, 1885, p. 228).

(2) $T^n$ denotes the $n$th iterate of T.

Let us pass now to a closed curve C of the following
much more general type: the closed set C divides the plane into two
simply connected open regions $S_e$ and $S_i$, of which $S_e$ contains
the "point at infinity", so that every point of C belongs either to
the boundary of $S_e$ or to that of $S_i$, or to both. Suppose
then that there exists a one-to-one, direct and continuous
transformation T of an annular neighborhood of C, of which C is an
invariant curve, that is, $T(C) = C$, so that $S_e$ and $S_i$ are
invariant regions. What can be said in this case?

To clarify this question, it is convenient to consider separately
the points P which are accessible from the exterior (interior),
that is, the points P which are end points of an exterior (interior)
Jordan arc. It is quite evident that such accessible points
remain accessible from the same side after the transformation T,
and that their cyclic order is not modified by this
transformation. Poincaré's reasoning then shows that there exists
a rotation coefficient $\tau_e$ associated with the exterior accessible
points and, likewise, a rotation coefficient $\tau_i$
associated with the interior accessible points.

In fact (1), suppose that the exterior region $S_e$ is mapped
onto the region exterior to a circle by a conformal transformation.
According to the well-known results of Carathéodory (2),
the neighborhood of every (exteriorly) accessible point
will then be transformed into the neighborhood of a corresponding point
of the circumference in such a way that the cyclic order of the accessible
points and of their images is the same (3), and these images will be
everywhere dense on the circumference. It follows that the transformation
T gives rise to a one-to-one, direct and continuous transformation $T^*$
of the circle into itself, since the definition of $T^*$ can be extended
to the whole circle by continuity.

(1) The reasoning of this paragraph is fundamentally superfluous, because
Poincaré's reasoning applies without modification to the set of accessible
points. Nevertheless it may be useful to reduce the question to the form
considered by Poincaré, as we do here.

(2) See, for example, Bieberbach, Lehrbuch der Funktionentheorie, vol. 2,
chap. 4.

(3) Each accessible point must be counted once for each neighborhood of the
point in $S_e$.

This transformation $T^*$ of the circle into itself therefore possesses
a rotation coefficient $\tau_e$. In the same way we define
a rotation coefficient $\tau_i$ with respect to the interior
region $S_i$.

Evidently the capital property of these two rotation coefficients
$\tau_e$ and $\tau_i$ is the following: by the transformation $T^n$ ($n$ arbitrary),
every point of C accessible from one of the two sides is
transformed into a point accessible from the same side, with a number
of complete rotations lying between

$\dfrac{n\tau}{2\pi} - 1 \quad \text{and} \quad \dfrac{n\tau}{2\pi} + 1,$

where $\tau$ denotes $\tau_e$ or $\tau_i$ according to the side considered.

This property shows that the coefficients $\tau_e$ and $\tau_i$ are
intrinsically attached to the curve C and to the associated
transformation T, and are not changed by any one-to-one
continuous deformation of the plane.

At first sight one might believe that the two rotation coefficients
$\tau_e$ and $\tau_i$ must be equal. Our principal aim is to
show here that this is by no means the case, by constructing
curves C with unequal coefficients and, moreover, such that every
point of C belongs at once to the boundaries of both the
exterior and the interior regions.

Admitting this result for the moment, we can give another,
perhaps more striking, statement of it: there exist such
curves C and corresponding transformations $\bar{T}$ such that
all the points accessible from the exterior are advanced by $\bar{T}$
toward the left (for example), while all the points accessible
from the interior are advanced by $\bar{T}$ toward the right.

That is to say, we can always find integers $m$ and
$k$ such that

$m\tau_e - 2k\pi > 0, \qquad m\tau_i - 2k\pi < 0.$

In fact one can always find an $m$ sufficiently large in absolute value
(positive if $\tau_e > \tau_i$ and negative in the contrary case) so that

$m(\tau_e - \tau_i) > 2\pi,$

and consequently an integer $k$ such that $2k\pi$ lies between $m\tau_i$
and $m\tau_e$.

Let us now consider the transformation $\bar{T}$ defined as
follows: $\bar{T}$ is the product of the iterated transformation $T^m$
and of a rotation through an angle $-2k\pi$ about a
point situated in the interior of $S_i$. Evidently this transformation $\bar{T}$
is of the same general type as T and, moreover, it will have as
rotation coefficients

$\bar{\tau}_e = m\tau_e - 2k\pi > 0 \quad \text{and} \quad \bar{\tau}_i = m\tau_i - 2k\pi < 0,$

whence the desired property.
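The choice of the integers $m$ and $k$ described above can be carried out by a direct search. The sketch below is one possible way to do it, not the author's procedure; the function name `find_m_k` and the bound `max_abs_m` are assumptions introduced for illustration.

```python
import math

def find_m_k(tau_e, tau_i, max_abs_m=10**6):
    """Search for integers m, k with m*tau_i < 2*k*pi < m*tau_e.

    m is taken positive when tau_e > tau_i and negative otherwise,
    so that m*tau_e > m*tau_i in both cases.
    """
    if tau_e == tau_i:
        raise ValueError("the two rotation coefficients must differ")
    step = 1 if tau_e > tau_i else -1
    m = step
    while abs(m) <= max_abs_m:
        # Smallest integer k with 2*k*pi strictly above m*tau_i.
        k = math.floor(m * tau_i / (2.0 * math.pi)) + 1
        if m * tau_i < 2.0 * math.pi * k < m * tau_e:
            return m, k
        m += step
    raise RuntimeError("no suitable m found within the search bound")

m1, k1 = find_m_k(0.5, 0.2)   # case tau_e > tau_i
m2, k2 = find_m_k(0.2, 0.5)   # case tau_e < tau_i
```

The search is certain to stop once $|m|\,|\tau_e - \tau_i| > 2\pi$, since an interval longer than $2\pi$ always contains a multiple of $2\pi$; it may stop earlier.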

2. Definition of certain auxiliary transformations A.

To construct such a curve C we shall have to employ
certain auxiliary transformations A, defined as follows:
A is a one-to-one, direct, analytic
and conservative (1) transformation of a ring into itself which turns
every radial direction toward its left, except perhaps on the two
circular boundaries (2).

(1) That is to say, the transformation preserves areas.

(2) More precisely, the angle between the transformed direction and the
radial direction at the transformed point may be regarded as a positive
analytic function $\psi$, with $0 \le \psi < \pi$ on the two circular boundaries.

A very simple example of such a transformation is the following:

(U) $r_1 = r, \qquad \theta_1 = \theta + \lambda + c r^2 \qquad (c > 0).$

Here $r$, $\theta$ are polar coordinates. The ring may be any
region $0 < a \le r \le b$. By this transformation, every
radial line $\theta = \alpha$ is transformed into the curve $\theta = \alpha + \lambda + c r^2$,
the direction of whose tangent is to the left of the radial
direction. In fact the angle $\psi$ between the radial direction and the direction
of the curve is $\arctan 2cr^2$.

Let us remark also that one can obtain without difficulty other
transformations of the same type. For example, let B be a
one-to-one, direct, analytic and conservative transformation of the
ring into itself, but not satisfying any other condition.
The transformation BU, for $c > 0$ sufficiently large, will
still belong to the class of auxiliary transformations A.
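The two properties claimed for the example (U), conservation of area and a twist angle $\psi = \arctan 2cr^2$, can be verified numerically. The sketch below uses finite differences; the function names and the sample values of $\lambda$ and $c$ are assumptions chosen for illustration. Area is measured in the coordinates $(\rho, \theta)$ with $\rho = r^2/2$, in which the area element $r\,dr\,d\theta$ becomes $d\rho\,d\theta$.

```python
import math

def twist_map(r, theta, lam=0.1, c=1.0):
    """The transformation (U): r1 = r, theta1 = theta + lam + c*r^2."""
    return r, theta + lam + c * r * r

def twist_angle(r, lam=0.1, c=1.0, h=1e-6):
    """Angle between the radial direction and the image of a radial line.

    For the image curve theta = alpha + lam + c*r^2 the tangent of this
    angle is r * d(theta)/dr, estimated here by a central difference.
    """
    _, t_plus = twist_map(r + h, 0.0, lam, c)
    _, t_minus = twist_map(r - h, 0.0, lam, c)
    return math.atan(r * (t_plus - t_minus) / (2.0 * h))

def area_jacobian(r, theta, lam=0.1, c=1.0, h=1e-6):
    """Jacobian determinant of (U) in the coordinates (rho, theta), rho = r^2/2."""
    def in_area_coords(rho, th):
        r1, th1 = twist_map(math.sqrt(2.0 * rho), th, lam, c)
        return 0.5 * r1 * r1, th1
    rho = 0.5 * r * r
    a_p, a_m = in_area_coords(rho + h, theta), in_area_coords(rho - h, theta)
    b_p, b_m = in_area_coords(rho, theta + h), in_area_coords(rho, theta - h)
    d_rho_rho = (a_p[0] - a_m[0]) / (2.0 * h)
    d_th_rho = (a_p[1] - a_m[1]) / (2.0 * h)
    d_rho_th = (b_p[0] - b_m[0]) / (2.0 * h)
    d_th_th = (b_p[1] - b_m[1]) / (2.0 * h)
    return d_rho_rho * d_th_th - d_rho_th * d_th_rho

psi = twist_angle(0.7, c=1.3)
det = area_jacobian(0.7, 0.4, c=1.3)
```

The determinant equal to 1 expresses that (U) is conservative, and a positive `psi` expresses that radial directions are turned to one side only.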

3. The invariant curves of A.

In what follows, we shall need some properties of the invariant curves
of such an auxiliary transformation A, curves which divide the
ring into two others. I studied these curves some time ago; they are
of great importance for theoretical dynamics (1), and I
shall repeat here the sequence of ideas as far as necessary (2).

Let $\Gamma$ be any invariant curve of this kind which does not
coincide with one of the two circular boundaries of the ring.
I say that such an invariant curve $\Gamma$ will be met only once
by every radius, so that its equation can be
written

$r = f(\theta),$

where $f$ is continuous and periodic of period $2\pi$ in $\theta$.

Evidently $\Gamma$ is a closed Jordan curve of simple
type (3).

Let us remark in the first place that the explicit form of the equation
results immediately from the stated geometric property.
In fact the function $f(\theta)$ which enters is well defined and
periodic, of period $2\pi$, by reason of this property. Moreover this
function must be continuous for every value $\theta_0$ of $\theta$; otherwise
we could find a sequence $\theta_1, \theta_2, \ldots$, with $\lim \theta_n = \theta_0$, such
that

$\lim_{n \to \infty} f(\theta_n) = g \neq f(\theta_0).$

But then the point $(\theta_0, g)$ would belong to $\Gamma$, which is a closed
set of points. This conclusion would contradict our hypothesis
that the geometric property holds, because the radial line
$\theta = \theta_0$ would meet $\Gamma$ at least twice.

(1) Surface Transformations and Their Dynamical Applications (Acta
math., t. 43, 1922, section 44).

(2) The present article, together with another, Sur l'existence de régions
annulaires d'instabilité dans la Dynamique, which has just appeared in the Annales
de l'Institut Henri Poincaré, has been written so as to be independent
of my long memoir in the Acta.

(3) By following the sequence of ideas a little further, one proves that $\Gamma$ is even
rectifiable. See my article in the Acta.

Let us make another preliminary remark. Let us define $\Gamma_i$ as
the interior side of $\Gamma$, that is, the set of limit points of
the interior region. According to our definition this set is itself
a closed curve; its interior region coincides with $S_i$,
while its exterior region contains not only the points
of the exterior region $S_e$ of $\Gamma$, but also the points of $\Gamma$ which
do not belong to $\Gamma_i$. We define the exterior side $\Gamma_e$ of $\Gamma$
in an analogous manner. If one could prove the geometric property
for the curves $\Gamma_i$ and $\Gamma_e$, one sees at once that
these two curves would have to coincide and that $\Gamma = \Gamma_i = \Gamma_e$ would
possess the same geometric property. It suffices then to prove the
geometric property for the curves $\Gamma_i$ and $\Gamma_e$.

In our proof, it is convenient to regard $\theta$ and $r$
as rectangular coordinates in the plane, with the $\theta$ axis
directed toward the left. The ring $a \le r \le b$ will thus be represented
by the horizontal strip between $r = a$ and $r = b$, and the curve $\Gamma_i$ will be
represented by an open periodic curve which extends
indefinitely to the right and to the left in this strip.

Let us now make the hypothesis that the geometric property
is not true for the curve $\Gamma_i$ in some case.
Consider the region of points accessible from $r = a$ along a
vertical line $\theta = \text{const.}$ of the $r$, $\theta$ plane without meeting the
invariant curve $\Gamma_i$ (see figure 1). This open region $S_i'$ either
coincides with $S_i$ or is only a part of $S_i$. The boundary of $S_i'$
is a closed curve which is composed of points of $\Gamma_i$ and of
open segments of the lines $\theta = \text{const.}$ situated in the interior of $S_i$.

Suppose in the first place that $S_i'$ and $S_i$ coincide. In this case,
the cyclic order of the points of $\Gamma_i$ is evidently that in which
the coordinate $\theta$ increases or at least does not decrease, and the angle which
every chord AB of the curve (between two accessible points A, B,
$A \neq B$, A preceding B) makes with the $\theta$ axis may be considered
as lying between $\pi/2$ and $3\pi/2$.

By our hypothesis there then exists at least one line $\theta = \theta_0$
which meets $\Gamma_i$ in two points $(\alpha, \theta_0)$, $(\beta, \theta_0)$, $(\alpha < \beta)$. But, by
the definition of $S_i' = S_i$, if the point $(\rho, \theta_0)$ belongs to $S_i'$, every
point $(\rho', \theta_0)$ with $\rho' < \rho$ belongs to it also. Moreover, all the points
on the segment which joins $(\alpha, \theta_0)$ to $(\beta, \theta_0)$ belong either to $S_i' = S_i$
or to its boundary. Evidently such a point cannot belong
to $S_i'$. Consequently the line $\theta = \theta_0$ meets $\Gamma_i$ in a single
segment AB.

[Fig. 1.]

Let C be a point in the interior of AB, and C' a point in the immediate
neighborhood of C which belongs to $S_i' = S_i$. Supposing for
the moment that such a point C' lies to the right of AB, one sees
(fig. 2) that the chord DE interior to $\Gamma_i$, obtained by prolonging
the line AC' from C' to the first points D, E of $\Gamma_i$, will have a
tangent nearly parallel to the negative $r$ axis. In this case the
transformation T will transform this segment into a curve $D_1E_1$, where $E_1$
follows $D_1$ on $\Gamma_i$, whose tangent direction makes everywhere with the
$\theta$ axis an angle greater than $\pi/2$, if we recall
the fact that T must turn every vertical direction toward its
left in the plane of rectangular coordinates $r$, $\theta$.

But this situation is not possible. In fact, the broken line FED,
composed of the vertical segment FE with F on $r = a$ and of ED, will be
transformed by T into $F_1E_1D_1$, where $F_1E_1$ and $E_1D_1$ are analytic
curves whose tangents make angles everywhere greater
than $\pi/2$ (see figure 2). It is then obvious that, under these
circumstances, $E_1$ cannot follow $D_1$ on $\Gamma_i$.

Let us remark that the above reasoning supposes that every
nearly vertical line DE must be deviated by T to the right of the
vertical direction. The uniformity thus supposed is evidently obtained
if all the vertical directions are deviated in the same
sense by T (and hence by $T^{-1}$). But, according to the conditions
imposed, this need not be true for the vertical directions on
the boundaries $r = a$ and $r = b$. To avoid this difficulty, it suffices
to employ in such a case suitable curves with vertical
directions on $r = a$ and $r = b$ in place of a straight line such
as AC'.

In the same way, employing $T^{-1}$ in place of T, one deduces
that such a point C' cannot exist in the immediate neighborhood
of C to the left of AB.

Consequently, the point C cannot be a limit point of
the open interior region $S_i$, which would be absurd.

We must therefore suppose that $S_i'$ forms only a part
of $S_i$. There must then exist segments AB of the boundary of $S_i'$ with
interior points C which are interior points of $S_i$ (see
figure 1).

Let us now consider the part $S_i''$ of $S_i$ accessible from $r = a$
along a simple regular curve (such as MN in figure 1),
the angle of whose tangent with the $\theta$ axis is always at least $\pi/2$.
This part of $S_i$ evidently includes $S_i'$, but can coincide
with $S_i'$ only if there exist no segments of the boundary of $S_i'$ of
type AB with a part of $S_i$ lying between AB and the curve to the right
of AB (see the regions t of figure 1).


425 



— 9 — 


By the transformation T, which increases the angle of every tangent
equal to $\pi/2$ and brings back to $\pi/2$ no angle greater than $\pi/2$, the
points of $S_i''$ are transformed into points of $S_i''$ which are accessible in
the same manner. In fact, the image of an auxiliary curve MN
which renders the point N accessible is itself an admissible auxiliary curve
$M_1N_1$ which renders $N_1$ accessible. Hence the image of $S_i''$ by T
must coincide with $S_i''$; it cannot be only a
part of it because T is conservative (1). But the points inaccessible
in this manner which are a little to the right of such a segment AB
(if it exists) must also be transformed into accessible points.
It follows that such a segment does not exist.

(1) We disregard here the set of measure 0 of points of $S_i''$
which do not belong to $T(S_i'')$.

In the same way, employing the inverse transformation $T^{-1}$,
one concludes that there does not exist either a segment AB with a
part of $S_i$ to its left.

Consequently, there exists no segment AB. We arrive
in conclusion at the stated result.


4. The rotation coefficients.

We are now going to consider the mutual relations of the curves $\Gamma$ which
are invariant under such a transformation T. Among these curves are
always found the two boundaries $r = a$ and $r = b$ of the ring. Evidently,
each invariant curve $\Gamma$ possesses a rotation coefficient
$\tau$. According to a fundamental property of rotation coefficients
cited above, every curve with a rotation coefficient of
the form $2p\pi/q$ ($p$, $q$ integers) contains at least one point whose angle
is increased by $2p\pi$ by the transformation $T^q$; such a point is
therefore geometrically invariant under $T^q$.

If two curves $\Gamma_1$ and $\Gamma_2$ have points in common, their rotation
coefficients are equal and of the form $2p\pi/q$; moreover,
their common points are invariant under $T^q$ (2).

If $\Gamma_1$ and $\Gamma_2$ have points in common, there are one or more
simply connected open regions between $\Gamma_1$ and $\Gamma_2$. By the
transformation T, such a region is transformed into another of the
same kind. But these regions, obtained starting from one of them
by indefinite iteration of T, cannot all be distinct,
because the total area is finite and the transformation T
conservative. Hence, after $q$ iterations, the given region is found again
in the same geometric position, and the two arcs of $\Gamma_1$ and $\Gamma_2$
which form its boundary are transformed into themselves. The
end points of these two arcs are therefore geometrically invariant,
whence the stated result.

(2) See section 46 of my article in the Acta for analogous results.

If two curves $\Gamma_1$ and $\Gamma_2$ have no point in common, the rotation
coefficient of the exterior curve is greater than
that of the interior curve.

Let us recall that the transformation T transforms a line
$\theta = \text{const.}$ into a curve the angle of whose tangent is everywhere
greater than $\pi/2$ in the plane of rectangular coordinates $r$, $\theta$.
Hence it is evident that the coordinate $\theta_1$ of the transformed point on
the exterior curve $\Gamma_1$ will exceed by at least $\varepsilon$ the
coordinate of the transformed point on $\Gamma_2$, if one starts from two points on
the same radius.

This fact shows us that the rotation coefficient of the
exterior curve $\Gamma_1$ must be at least as great as that of $\Gamma_2$. In
fact, two corresponding points are transformed successively
into points such that those of $\Gamma_1$ are always more
advanced than those of $\Gamma_2$ by at least $\varepsilon$.

It only remains then to exclude the equality of the two rotation
coefficients.

Consider in the first place the case where this coefficient is of the
form $2p\pi/q$.

Let us now imagine that we follow $T^q$ by a rotation
through the angle $-2p\pi$. We thus obtain a transformation $T^*$ of the
same type, of which $\Gamma_1$ and $\Gamma_2$ are invariant curves with
rotation coefficient zero. There exists then a point $A_1$ of $\Gamma_1$ and a
point $A_2$ of $\Gamma_2$ which are invariant under $T^*$, and these two points
cannot have the same coordinate $\theta$ by reason of the geometric
property of $T^*$.

Let us construct a curvilinear quadrilateral in the following manner:
two of the sides are the two vertical segments lying between $\Gamma_1$
and $\Gamma_2$ which contain respectively $A_1$ and $A_2$; the two other
sides are the arcs of $\Gamma_1$ and $\Gamma_2$ between these two vertical sides.
Consider the image of this quadrilateral under $T^*$. The vertices $A_1$ and $A_2$
are invariant, but the two sides $\theta = \text{const.}$ are transformed into
two curves, the angle of whose tangents is everywhere greater
than $\pi/2$. Consequently, the images of these two sides lie,
the one to the right of $A_1$, the other to the left of $A_2$. But in this case,
the image of the quadrilateral would enclose this quadrilateral
itself in its interior or else would be included in it, according as $A_2$ is
situated to the right or to the left of $A_1$. Since $T^*$ is conservative,
this is not possible.

There remains then to consider the case where the rotation coefficient $\tau$
is not of this special form. In this case, we can find
an integer $q$ for which

$2p\pi - \dfrac{\varepsilon}{2} < q\tau < 2p\pi,$

where $p$ is an integer and $\varepsilon > 0$ is the quantity $\varepsilon$ indicated above (1).

Every point of $\Gamma_2$ must then be advanced by $T^q$ through an angle
less than $2p\pi$. In fact, there are points of $\Gamma_2$ which are advanced by
less than $2p\pi$ by $T^q$, since the rotation coefficient of $T^q$ is
$q\tau < 2p\pi$. Hence, if a point were advanced by more than $2p\pi$, one
would find points advanced exactly by $2p\pi$, which is impossible
for $\tau / 2\pi \neq p/q$.

On the other hand, there are points of $\Gamma_2$ which are advanced by at
least $2p\pi - \varepsilon/2$ by $T^q$, since the rotation coefficient of $T^q$ is
$q\tau > 2p\pi - \varepsilon/2$. The corresponding points of $\Gamma_1$ will then be advanced
by at least $2p\pi + \varepsilon/2$. But the reasoning made above for $\Gamma_2$
shows that every point of $\Gamma_1$ must be advanced by less than $2p\pi$. In
this way, we again obtain a contradiction in this second case.

(1) We suppose here that $\tau$ is positive; in the contrary case we could
consider $T^{-1}$ in place of T. By developing $\tau / 2\pi$ in a continued fraction, one
will evidently find fractions $p/q$ with the indicated property.
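The footnote appeals to continued fractions to produce integers $p$, $q$ for which $q\tau$ falls just short of $2p\pi$. The sketch below illustrates the same conclusion by a plain exhaustive search over $q$, which is a simpler technique substituted here for the continued fraction expansion. It assumes $\tau > 0$ and uses the bound $\varepsilon/2$ as one reading of the inequality; the function name `find_close_return` and the limit `max_q` are hypothetical.

```python
import math

def find_close_return(tau, eps, max_q=10**6):
    """Search for integers q, p with 2*p*pi - eps/2 < q*tau < 2*p*pi.

    Plain search over q (not a continued fraction expansion); tau is
    assumed positive and tau/(2*pi) irrational in the intended use.
    """
    two_pi = 2.0 * math.pi
    for q in range(1, max_q + 1):
        p = math.ceil(q * tau / two_pi)   # smallest p with 2*p*pi >= q*tau
        gap = two_pi * p - q * tau        # distance by which q*tau falls short
        if 0.0 < gap < eps / 2.0:
            return q, p
    raise RuntimeError("no suitable q found within the search bound")

q_found, p_found = find_close_return(1.0, 0.1)
```

For irrational $\tau/2\pi$ the multiples $q\tau$ are dense modulo $2\pi$, so such a $q$ exists for every $\varepsilon > 0$; the convergents of the continued fraction give these values far more efficiently than the search shown.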


We are now going to complete our result relative to the
curves $\Gamma$ for which $\tau$ is of the form $2p\pi/q$ in the following
manner:

All the curves $\Gamma$ which belong to the same coefficient
$\tau = 2p\pi/q$ possess at least one point invariant under $T^q$ in
common. In this set, there is an exterior curve $\Gamma_1$ and
an interior curve $\Gamma_2$.

If there is only a finite number of curves, these facts are evident.
In fact, the exterior boundary of these curves constitutes a curve $\Gamma_1$,
and the interior boundary a curve $\Gamma_2$. These two boundaries will have
points in common by what precedes, and these common points
will necessarily lie on all the other curves $\Gamma$ in
question.

If there is an infinite number of curves $\Gamma$, consider the
simply connected open region exterior to all these curves. Its
boundary is evidently an invariant curve $\Gamma_1^*$. Since every
point of a curve $\Gamma$ in the immediate neighborhood of $\Gamma_1^*$ is advanced,
by $T^m$, through an angle lying between

$\dfrac{2mp\pi}{q} - 2\pi \quad \text{and} \quad \dfrac{2mp\pi}{q} + 2\pi,$

one sees that the rotation coefficient of $\Gamma_1^*$ is also $2p\pi/q$. Hence the
curve $\Gamma_1^*$, as well as the analogous curve $\Gamma_2^*$, belong to the
set of curves $\Gamma$. These two curves will have at least one point invariant
under $T^q$ in common, and these common points will have to be situated on
all the curves $\Gamma$.

One proves immediately:

Every infinite expanding (or decreasing) sequence of
curves $\Gamma$ tends uniformly toward a curve $\Gamma$.

In fact, such a sequence defines an interior open region,
invariant under T, whose boundary is a curve $\Gamma$ which the
regions of the sequence approach uniformly.

5. The fundamental lemma.

Consider now an arbitrary radius $\theta = \theta_0$ and mark on this radius
the points of intersection with each curve $\Gamma$. This set of points is closed.
Indeed, if the points of intersection of $\Gamma_1, \Gamma_2, \ldots$ tend toward a
point P of $\theta = \theta_0$ with $r$ increasing, these curves

$\Gamma_1, \Gamma_2, \Gamma_3, \ldots$

form an expanding sequence whose limit curve $\Gamma$ meets
$\theta = \theta_0$ at the limit point P. One can proceed in the same way if the
points of intersection tend toward a point P with $r$ decreasing.

More generally, let us mark on $\theta = \theta_0$ all the points lying
between the exterior curve $\Gamma$ and the interior curve $\Gamma$ of the curves
which possess the same rotation coefficient $\tau = 2p\pi/q$.

The set of points and intervals thus marked is likewise
closed, and each point or interval corresponds to a unique
rotation coefficient which increases with $r$.

The set of marked points and intervals either fills the line
between $r = a$ and $r = b$ or does not fill it.

In the first case, the network of curves and of the
simply connected regions between them fills the whole ring;
for example, this possibility is realized by the transformation
of variable rotation U mentioned above. In the general case,
each asymptotic branch issuing from an unstable invariant point
of $T^q$ coincides with another branch of the same kind.

In the second case, one finds a segment PQ on the line
$\theta = \theta_0$ which contains no interior "marked" points, although
P and Q are marked. In other words, there is an annular
region of the ring whose two boundaries are successive
invariant curves which cut $\theta = \theta_0$ in P and Q respectively.
Such a region will be called a "ring of instability"
(région annulaire d'instabilité).

In what follows, we shall rely on the following lemma:

Lemma. There exist auxiliary transformations A, of
type T, whose ring contains at least one ring of
instability.

In a recent article (1), I proved this fact which, from the
analytic point of view, seems almost certain in advance. Since in that
article I employ a ring lying between $r = 0$ and $r = 1$, the ring of
instability could extend as far as $r = 0$.

(1) Sur l'existence de régions annulaires d'instabilité (Annales de l'Institut
Henri Poincaré, 1931).

6. A property of rings of instability.

Let R then be such a ring of instability between the
successive invariant curves $\Gamma_1$ and $\Gamma_2$, with equations

$r = f_1(\theta), \qquad r = f_2(\theta) \qquad [f_1(\theta) > f_2(\theta)],$

where $f_1$ and $f_2$ are continuous and periodic of period $2\pi$.

In the immediate neighborhood of every point of $\Gamma_1$ ($\Gamma_2$) there exist
points some of whose images under powers of T lie
in the immediate neighborhood of a given point of
$\Gamma_2$ ($\Gamma_1$).

In fact consider a small open region $\sigma$ about the given point
of $\Gamma_2$. The sequence of regions $\sigma, T(\sigma), T^2(\sigma), \ldots$ joined to the
region $r < f_2(\theta)$, likewise open and connected, constitutes an
open and connected region G, such that every point of the transformed
region $T(G)$ evidently belongs to G. Since T is conservative,
one concludes that G is a region invariant under T. Moreover,
the simply connected open region H exterior to G is
invariant. Hence the boundary of H must be an invariant closed curve
according to our definition, and must consequently coincide
with $\Gamma_1$, whence the stated result.

7. The definitive transformation $T_\varepsilon$.

Let us choose then a conservative transformation A which admits such a
ring of instability

$f_2(\theta) \le r \le f_1(\theta)$

and let us denote the rotation coefficients corresponding to the two
boundaries $\Gamma_1$ and $\Gamma_2$ by $\tau_1$ and $\tau_2$ ($\tau_1 > \tau_2$). Let us choose also in the interior
a closed curve K, of the form

$r = f(\theta),$

where $f$ is analytic in $\theta$ and periodic of period $2\pi$, such
that K divides the area of the ring of instability into two equal parts.

Let $V_\varepsilon$ be the following transformation:

$V_\varepsilon: \quad \theta_1 = \theta, \qquad r_1^2 = (1 - \varepsilon)\, r^2 + \varepsilon f^2(\theta),$

where $0 < \varepsilon < 1$ is a quantity which we shall choose sufficiently
small later.

The definitive transformation which leads us to our principal conclusion
is the composite transformation $T_\varepsilon = A V_\varepsilon$.

The purely radial transformation $V_\varepsilon$ is analytic everywhere
except at the origin $r = 0$ and diminishes every area in the ratio $(1 - \varepsilon) : 1$,
because one has

$\tfrac{1}{2}\, d(r_1^2)\, d\theta_1 = (1 - \varepsilon)\, \tfrac{1}{2}\, d(r^2)\, d\theta.$

Consequently $T_\varepsilon$ enjoys the same properties.

Moreover, the transformation $V_\varepsilon$ leaves the curve K invariant,
while every point at a radial distance $d$ from K is transformed
radially toward K into a point at a smaller radial distance
from K. Hence $T_\varepsilon = A V_\varepsilon$ transforms the ring of instability into a
part of itself,

n h- « \r -n i s ^ </? -«i /? -/* \ 

(see the figure below), whose boundaries are the respective images
$\Gamma_1'$ and $\Gamma_2'$ of $\Gamma_1$ and $\Gamma_2$. The inverse transformation $T_\varepsilon^{-1} = V_\varepsilon^{-1} A^{-1}$
is analytic at least in the interior of the annular region $\Gamma_1'\Gamma_2'$,
and increases every area in the ratio $1 : (1 - \varepsilon)$.

[Fig. 3.]

It is evident that the area included between $\Gamma_1$ and $\Gamma_1'$ is the same as that
included between $\Gamma_2$ and $\Gamma_2'$.

Moreover it is evident that the transformation $T_\varepsilon$ ($T_\varepsilon^{-1}$) deviates the
radial direction toward its left (right). This property is
fundamental for what follows.
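The stated properties of $V_\varepsilon$ and of $T_\varepsilon = A V_\varepsilon$ can be checked on a small numerical model. The sketch below rests on several assumptions, all introduced here for illustration: $V_\varepsilon$ is taken in the form $r_1^2 = (1 - \varepsilon) r^2 + \varepsilon f^2(\theta)$, which is the radial map determined by the two stated requirements (K invariant, areas multiplied by $1 - \varepsilon$); the curve K is a hypothetical $f(\theta) = 1 + 0.1 \cos\theta$; and the twist (U) stands in for A, although (U) itself has no ring of instability.

```python
import math

def f(theta):
    # Hypothetical curve K: r = f(theta), analytic and 2*pi-periodic.
    return 1.0 + 0.1 * math.cos(theta)

def v_eps(r, theta, eps):
    """Radial contraction toward K (assumed form of V_eps)."""
    return math.sqrt((1.0 - eps) * r * r + eps * f(theta) ** 2), theta

def a_twist(r, theta, lam=0.1, c=1.0):
    """Stand-in for A: the conservative twist r1 = r, theta1 = theta + lam + c*r^2."""
    return r, theta + lam + c * r * r

def t_eps(r, theta, eps):
    """Composite T_eps = A V_eps: apply V_eps first, then A."""
    return a_twist(*v_eps(r, theta, eps))

def area_jacobian(transform, r, theta, h=1e-6):
    """Jacobian determinant in the coordinates (rho, theta), rho = r^2/2."""
    def in_area_coords(rho, th):
        r1, th1 = transform(math.sqrt(2.0 * rho), th)
        return 0.5 * r1 * r1, th1
    rho = 0.5 * r * r
    a_p, a_m = in_area_coords(rho + h, theta), in_area_coords(rho - h, theta)
    b_p, b_m = in_area_coords(rho, theta + h), in_area_coords(rho, theta - h)
    d_rho_rho = (a_p[0] - a_m[0]) / (2.0 * h)
    d_th_rho = (a_p[1] - a_m[1]) / (2.0 * h)
    d_rho_th = (b_p[0] - b_m[0]) / (2.0 * h)
    d_th_th = (b_p[1] - b_m[1]) / (2.0 * h)
    return d_rho_rho * d_th_th - d_rho_th * d_th_rho

EPS = 0.2
det = area_jacobian(lambda r, t: t_eps(r, t, EPS), 1.3, 0.7)
r_on_k, _ = v_eps(f(0.7), 0.7, EPS)      # a point of K
r_moved, _ = v_eps(1.3, 0.7, EPS)        # a point off K
```

The determinant comes out as $1 - \varepsilon$, the point of K is left in place, and the point off K moves radially closer to K, matching the three properties used in the text.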


8 . La courbe invariante correspondante C, 


—-----vw.«voj»vuuouvo — Ln repumm 

encore unc fois la transformation on voit que la region annulaire 
r,r 2 sc trouve transforniee en une autre T 2 T,/ qui lui est inldrieure 
oii I ' = I * ( T| ) et r./ = 17 (r 2 ). Continuant inddfiniinent de cette 


rcpetanl 


manure, on obticut dvidemnient unc suite infinic des regions annu- 
laires T,' r a ' , T 2 r a 9 , . . ., cliacunc a Fintdrieur des regions prded- 
dentes, ou, en gdndral, 


iY‘= T«(r,), 1./* = T t «»(r s >, 

De plus Fa ire dc la n l ' m * region annulaire r," r." est prdcisdinent 

Fair© de T, T* diminuee dans le rapport - ~ * )n , et doit done tendre 

vers z(3ro qunnd n emit inddfinimenl. De plus, a cliaquc etape les 
aires des regions annulaircs T, T ” et r..r"‘ restent dgalcs. 

Consequently, the set of points interior to all these annular regions is a closed curve, $C$, which divides the initial annular region into two equal parts. One sees at once that this curve is invariant under $T_\varepsilon$, as are its two sides, the curves $C_e$ and $C_i$. One sees also that the set $C_{ei}$ common to $C_e$ and $C_i$ is again a curve invariant under $T_\varepsilon$; the two sides of $C_{ei}$ are identical, and there exists no other curve of this kind.

There exists therefore a curve $C_{ei}$ running around the annular region of instability $\Gamma_1\Gamma_2$, the common boundary of the open simply connected exterior and interior regions, which is invariant under the transformation $T_\varepsilon$ and which divides $\Gamma_1\Gamma_2$ into two equal parts.

It is this curve $C_{ei}$ that we shall study more closely.

Let us say that a curve is rolled to the left with respect to the interior if there exist regular interior curves $OP$, without double points, whose tangential direction is never to the right of the radial direction, and which join the origin $O$ to an arbitrary interior point $P$. Similarly, let us say that a curve is rolled to the left with respect to the point at infinity if there exist regular curves $O'P$, without double points, whose tangential direction is never to the right of the radial direction directed toward $O$, and which join a point $O'$ belonging to an exterior concentric circle to an arbitrary exterior point $P$.

The curve $C_{ei}$ is rolled to the left with respect to the interior and to the exterior.

If this is not so, suppose for example that there exist points $P$ interior to $C_{ei}$ that are not accessible from the left from $O$ along such a curve $OP$.

Let us now return to the plane of the rectangular coordinates $r$, $\theta$.

In the first place, the point $P$, which is not a point of $C_{ei}$, must nevertheless belong to $C_i$. Otherwise $P$ would be the image of a point $Q$ between $\Gamma_2$ and $\Gamma_2'$ under a suitable power $T_\varepsilon^n$ of $T_\varepsilon$, as the following construction shows. Join $\Gamma_2$ to $Q$ by a vertical line $O_1Q$. The image $O_1^{(1)}Q^{(1)}$ of this line lies between $\Gamma_2'$ and $\Gamma_2''$, and joins the image $O_1^{(1)}$ of $O_1$ on $\Gamma_2'$ to the image $Q^{(1)}$ of $Q$. The tangential direction of $O_1^{(1)}Q^{(1)}$ is everywhere deflected to the left of the vertical direction, except perhaps at the point $O_1^{(1)}$. Let $O_2$ then be the point of $\Gamma_2$ on the same vertical line as $O_1^{(1)}$. Consider the curve $O_2O_1^{(1)}Q^{(1)}$ formed by the straight segment $O_2O_1^{(1)}$ and the curve $O_1^{(1)}Q^{(1)}$. This curve has no double point, and its tangential direction is never to the right of the vertical direction. Its image $O_2^{(1)}O_1^{(2)}Q^{(2)}$ will also have a tangential direction to the left of the vertical direction, except perhaps at the point $O_1^{(2)}$. Continuing in this way, one obtains at the $n$-th stage the curve

\[ O_{n+1}\,O_n^{(1)} \cdots O_1^{(n)}\,P \]

of the required type, which joins $O_{n+1}$ on $\Gamma_2$ to $P = Q^{(n)}$.

We see therefore that every point $P$ in the interior of $C_{ei}$ that does not belong to $C_i$ must be accessible from $\Gamma_2$ from the left. Moreover, the method of construction of our auxiliary curve shows that the tangential direction of this curve will be deflected to the left of the vertical direction by an angle $d > 0$ between $\Gamma_2'$ and $C_{ei}$, and this independently of the point $P$ considered.

It remains to consider the interior points $P$ that belong to $C_i$. Suppose that such a point is not accessible from $\Gamma_2$ from the left. Choose a square about this point small enough to contain no point of $C_{ei}$. This square must contain points accessible from the left, because the regions exterior and interior to $C_{ei}$ have a total area equal to that of the annular region of instability. Construct an auxiliary curve from $O$ to such a point (see Fig. 4). If this curve enters the square for the first time at a point $Q$ through the upper side or through the right-hand side, the curve $OQP$, where $QP$ is a straight segment, renders $P$ accessible from the left. Indeed, the tangential angle is at least $\pi$ on the upper side and remains greater than […] in the interior of the square; on the right-hand side the angle of entry may be […], but in the interior this angle will exceed […]. In the same way, if the auxiliary curve enters the square through the lower side or through the left-hand side with a tangential angle greater than […], that is, from above, the auxiliary curve $OQP$ can be used in the same way as before. Evidently the auxiliary curve cannot enter the square from below through the left-hand side.

Fig. 4.

It follows that the auxiliary curve must enter the square from below, through the lower side, at a point $Q$ to the left of $P$, with an angle between $\pi/2$ and $\pi$. In every other case one could obtain an auxiliary curve by adjoining the straight segment $QP$ to $OQ$.

We thus see that $OQ$ must cut, for the last time, the vertical line containing $P$ at a point $R$ below the square, with a tangential angle between […] and […]. Let $S$ be the first point at which $OR$ meets this vertical line below $P$, as Fig. 5 shows; such a point $S$ either coincides with $R$ or lies above $R$. There must exist a point $\Sigma$ of $C_{ei}$ on the segment $SP$ (but naturally outside the square): otherwise one could obtain an auxiliary curve by adjoining the vertical segment $SP$ to $OS$. We choose for $\Sigma$ the point of $C_{ei}$ nearest to $P$ on this vertical line; for this, note that $C_{ei}$ is a closed set that does not contain $P$.

Fig. 5.

The arc $RQ$ therefore lies entirely between the vertical lines that contain $P$ and $Q$ respectively, because the arc $RQ$ is deflected to the left of the vertical direction or follows that direction. Moreover, $R$ must lie below $\Sigma$.

Let us now shrink the square about $P$ indefinitely. We obtain a sequence of arcs $RQ$ that extend from a point $R$ below $\Sigma$ to a point $Q$ near $P$, and that lie between the two neighboring vertical lines passing through $R$ and $Q$. We see therefore that such an arc differs little from the vertical segment $\Sigma P$.

To exhibit the contradiction, we now recall that an auxiliary curve can be chosen so that its tangential direction is deflected to the left of the vertical by an angle of at least $d > 0$, uniformly between $\Gamma_2'$ and $C_i$, and consequently in a fixed neighborhood of the point $P$, which belongs to $C_i$. It is obvious that the arcs $QR$ cannot have this property without leaving the narrow vertical strip between $Q$ and $R$, whence the announced contradiction.

An entirely analogous argument shows that $C_{ei}$ is also rolled to the left on the exterior side.

It seems probable that, in a similar way, all the points of $C_{ei}$ itself that are accessible along an exterior or interior Jordan arc are accessible along a regular curve deflected to the left (the end point on $C_{ei}$ being excepted).

9. The radially accessible points of $C_{ei}$. On each radial line issuing from $O$ there exists one and only one point of $C_{ei}$ that is radially accessible. The totality of these points is arranged in cyclic order on $C_{ei}$ according to the increasing order of the corresponding values of $\theta$.

Every radially accessible point (1) of $C_{ei}$ is transformed into a point of the same kind by the inverse transformation $T_\varepsilon^{-1}$.

To prove this, let us begin by proving that a point $P$ interior to $C_{ei}$ that is accessible from the right is also radially accessible.

Let $O^*P$ be an auxiliary curve deflected to the right. If $O^*P$ lies entirely to the left of $P$ (see Fig. 6), we reason as follows.

Fig. 6.

The point $P$ is also accessible from the left, as we have proved. A corresponding auxiliary line $OP$ cannot cut the vertical line containing $P$ at a point $Q$ below $P$ without having already cut $O^*P$ from above at a point $R$ that precedes $Q$. Such a point $R$ cannot exist. Indeed, in the contrary case the curve $O^*RO$ contains no point of $C_{ei}$, and $C_{ei}$ would have to be interior to the region between $\Gamma_2$ and $O^*RO$, which is not possible. Consequently the curve $OP$ cuts neither the portion of the vertical through $P$ situated below $P$, nor the curve $O^*P$, and the point $P$ must be radially accessible, because there is no point of $C_{ei}$ in the interior of $O^*PO$.

(1) Either from the interior side or from the exterior side. Evidently it suffices to consider only the first case.

But suppose that the curve $O^*P$ is not entirely to the left of the point $P$. Let $P^*$ be the first point at which $O^*P$ meets the vertical line passing through $P$. Evidently $P^*$ must lie above $P$, by reason of the characteristic property of the tangential direction. Hence if we could prove that $P^*$ is radially accessible, the point $P$ would be so as well.

Considering now the point $P^*$ in place of $P$ in the preceding argument, and using the auxiliary curve $O^*P^*$, which is deflected to the right and lies to the left of $P^*$, we prove by the same method that $P^*$ must be radially accessible, which completes the proof.

Suppose now that $P$ is a radially accessible point of $C_{ei}$, and consider its image $P^{(-1)} = T_\varepsilon^{-1}(P)$ under […] where $O^{(-1)}$ is the image of the point at which $O^*P$ meets $\Gamma_2'$ (see Fig. 7), and this curve is deflected to the right.

Fig. 7.

Every point $Q$ interior to $O^{(-1)}P^{(-1)}$ is therefore radially accessible, by the proof above, which shows that $P^{(-1)}$ must lie farther to the right than every other point of […]. We see also that, for the same reason, the region $D$ below the curve $O^{(-1)}P^{(-1)}$ contains no point of $C_{ei}$ to the left of $P^{(-1)}$.


On the other hand, the stated result evidently holds if there exists no point of $C_{ei}$ below $P^{(-1)}$; otherwise we take for $\Sigma$ the point of this kind nearest to the $\theta$-axis. The same figure shows that the radially accessible point $\Sigma$ precedes the point $P^{(-1)}$ in the cyclic order on $C_{ei}$. The accessible points of $C_{ei}$ between $\Sigma$ and $P^{(-1)}$ are evidently accessible from the region below $O^{(-1)}P^{(-1)}$ lying between $\Sigma$ and $P^{(-1)}$. But if there existed a part of the interior of $C_{ei}$ accessible in this way and to the right of $\Sigma P^{(-1)}$, the points of this part would not be accessible along a curve deflected to the left. Consequently we must conclude that the segment $\Sigma P^{(-1)}$ is itself the part of $C_{ei}$ between $\Sigma$ and $P^{(-1)}$.

But this possibility must also be excluded. Indeed, there exist below […], near $P^{(-1)}$, points that do not belong to $C_i$. As we have already remarked, a point of this kind is accessible from the left along a curve deflected to the left of the vertical direction by an angle of at least $d > 0$ above $\Gamma_2'$. It is quite evident that such curves do not exist for points sufficiently near the point $P^{(-1)}$ of $C_{ei}$.

From the result thus proved we deduce the following as an immediate corollary:

Let $P$ be a radially accessible point of $C_{ei}$, and $P^{(-k)}$ its image under $T_\varepsilon^{-k}$, likewise radially accessible. If the difference between the two corresponding values $\theta$ and $\theta_{-k}$ of $\theta$ is of the form

\[ 2l\pi \le \theta_{-k} - \theta < 2(l+1)\pi, \]

the transformation $T_\varepsilon^{-k}$ advances every point of $C_{ei}$ by more than $l - 1$ and by less than $l + 2$ complete rotations. If the sign of equality holds, the point $P$ is advanced by precisely $l$ complete rotations.


10. On the two rotation coefficients of $C_{ei}$. Suppose now that the interior rotation coefficient of the curve $C_{ei}$ is $\tau_i^*$. By the fundamental property of this coefficient, we shall then have an amount of rotation lying between

\[ -k\tau_i^* - 2\pi \quad\text{and}\quad -k\tau_i^* + 2\pi \]

after the transformation $T_\varepsilon^{-k}$, whence

[…]

by what precedes.

Let us make the hypothesis that, as $\varepsilon$ tends to 0, the least distances from $C_{ei}$ to $\Gamma_1$ and to $\Gamma_2$ tend uniformly to 0. Every point $P$ at this least distance from $\Gamma_2$ is certainly radially accessible, and so are its images under $T_\varepsilon^{-1}$, $T_\varepsilon^{-2}$, ..., as we proved above. On the other hand, for small $\varepsilon$, $T_\varepsilon^{-1}$ differs little from $A^{-1}$. Consequently the distances from $P^{(-1)}$, $P^{(-2)}$, ... to the curve $\Gamma_2$ (invariant under $A$) remain small up to a large value of $k$, and these points will lie near $Q^{(-1)}$, $Q^{(-2)}$, ..., where $Q^{(-1)}$, $Q^{(-2)}$, ... are the successive images under $A^{-1}$ of a point $Q$ of $\Gamma_2$ near $P$. The point $P^{(-k)}$ will therefore be advanced by an angle that differs from $-k\tau_i$ ($\tau_i$ denoting the rotation coefficient of $\Gamma_2$ under $A$) by less than $4\pi$. Hence

\[ -k\tau_i - 4\pi < \theta_{-k} - \theta < -k\tau_i + 4\pi, \]

whence, by the inequality of section 8,

\[ \frac{2(l+3)\pi}{k} > -\tau_i > \frac{2(l-2)\pi}{k}. \]


Comparing this inequality with the analogous inequality above, relating to $\tau_i^*$, shows that

[…]

Since $k$ becomes arbitrarily large as $\varepsilon$ tends to 0, we see that $\tau_i^*$ tends to $\tau_i$.

In the same way one proves that $\tau_e^*$ tends to $\tau_e$ as $\varepsilon$ tends to 0.

But $\tau_i$ is less than $\tau_e$, as we have seen. Consequently, for $\varepsilon$ small enough we shall have $\tau_i^* < \tau_e^*$; that is, the rotation coefficient of $C_{ei}$ with respect to the exterior region must exceed the rotation coefficient with respect to the interior.

All this assumes that the hypothesis made above is valid when $\varepsilon$ is sufficiently small.

It remains now to prove this hypothesis.



Suppose that it is not satisfied; for example, suppose that the least distance from $C_{ei}$ to $\Gamma_2$ does not tend uniformly to 0 as $\varepsilon$ tends to 0. We could then find a constant $\lambda$ and an infinite sequence of values $\varepsilon_1$, $\varepsilon_2$, ... of $\varepsilon$, with $\lim \varepsilon_n = 0$, such that every corresponding curve $C_{ei}$ has no point at a distance from $\Gamma_2$ less than $\lambda$. Consider now the images $B^{(n)}$ under $A^{n}$ of the strip

\[ f_2(\theta) < r < f_2(\theta) + \lambda \]

above $\Gamma_2$, and note that this strip, by hypothesis, lies below $C_{ei}$ for the values $\varepsilon_1$, $\varepsilon_2$, ... of $\varepsilon$. Consider the connected open region $C$ formed by these images; it is evidently invariant under $A$ and $A^{-1}$.

We may write symbolically

\[ C = \lim C_n, \]

where

\[ C_n = B^{(-n)} + \cdots + B + \cdots + B^{(n)}. \]

The connected open regions $C_1$, $C_2$, ... expand toward $C$ as $n$ tends to infinity.

Let us now adjoin to $C_n$ all the closed sets contained in its interior. We thus obtain a simply connected open region $D_n$. I say that the area of $D_n$ cannot exceed half the area of the annular region. To prove this, consider

\[ C_n' = T_\varepsilon^{-n}(B) + \cdots + B + \cdots + T_\varepsilon^{n}(B), \]

where $n$ is arbitrarily large and $\varepsilon$ is a value of the sequence $\varepsilon_1$, $\varepsilon_2$, .... Since $C_n'$, and consequently the total region $D_n'$ enclosed by $C_n'$, remain entirely below the corresponding invariant curve $C_{ei}$ (which divides the area of the annular region $\Gamma_1\Gamma_2$ into two equal parts), we see that the region $D_n'$ enclosed by $C_n'$ has an area that does not exceed half that of $\Gamma_1\Gamma_2$. On the other hand, as $\varepsilon_l$ becomes small, $T_\varepsilon$ tends to $A$; we see therefore that $D_n'$ tends to $D_n$, whence the italicized conclusion.

Consequently the sequence of open, simply connected, expanding regions $D_1$, $D_2$, ... will tend to a simply connected open region $D$ whose total area cannot exceed half the area of the annulus. It follows that the closed boundary curve is a curve invariant under $A$ that can coincide neither with $\Gamma_2$ nor with $\Gamma_1$. By the fundamental property of the annular region of instability, no such curve exists.

We conclude therefore:

For $\varepsilon$ sufficiently small, the closed invariant curve $C_{ei}$ possesses two unequal rotation coefficients $\tau_i^*$ and $\tau_e^*$, with respect to the interior and the exterior respectively; these coefficients satisfy $\tau_i^* < \tau_e^*$.

11. On another property of $C_{ei}$. Let us note also the following property:

The curve $C_{ei}$ can have no point accessible at once from the exterior and from the interior.

Indeed, if such a doubly accessible point $P$ existed, all its images $P^{(k)}$ ($k = \pm 1, \pm 2, \ldots$) under $T_\varepsilon$ would be of the same type and would have to have the same relative cyclic order on the two sides. But this order determines the rotation coefficients, which would then be equal.

This property shows the rather complicated nature of the curve $C_{ei}$.

We shall not carry the study of the properties of $C_{ei}$ any further.

12. A historical remark. I observed in 1916 that the conservative transformations associated with dynamical systems with two degrees of freedom deflect directions to the left or to the right of the radial direction in the neighborhood of an invariant point of stable type. I then asked myself what could be said of a conservative transformation $T$ of a ring into itself possessing this property, and I imagined for some time that, instead of the two invariant points that exist by Poincaré's last geometric theorem, one might in this case obtain an entire invariant curve. In trying to prove this conjecture by relying on the almost certain existence of annular regions of instability, I studied the auxiliary transformation $T_\varepsilon$ employed above. I had even believed that I had proved this hypothesis by the method of reductio ad absurdum, obtaining as the other possibility an invariant curve $C_{ei}$ with two unequal rotation coefficients. It was only somewhat later that I realized that such an invariant curve might well exist.

More recently I have proved the existence of annular regions of instability (see my article in the Annales de l'Institut Henri Poincaré), and this has enabled me, following the same line of ideas, to prove the existence of this remarkable category of closed curves.



Reprinted from Annales de l'Institut Henri Poincaré, 1932, Vol. 2, pp. 369-386.


On the existence of regions of instability in dynamics
(Sur l'existence de régions d'instabilité en Dynamique)

BY

George D. BIRKHOFF


1. Consider a dynamical system with two degrees of freedom, and consider the motions that correspond to a given value of the total energy. In this case the differential equations of motion can be written in the following form:

\[ \frac{dp}{dt} = -\frac{\partial H(p, q, t)}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H(p, q, t)}{\partial p}. \tag{1} \]

In particular, in the immediate neighborhood of a periodic motion, the independent variable $t$ can be regarded as an angular coordinate of period $2\pi$, and $H$ as a periodic function of this variable, the periodic motion itself corresponding to the trajectory $p = q = 0$ in the space of the variables $p$, $q$, $t$.


2. According to a method employed in the work of Poincaré (1), Levi-Civita (2) and myself (3), the study of the motions near a periodic motion reduces to the study of a point transformation $T$ of the plane. Denote by

\[ p = p(p_0, q_0, t) \quad\text{and}\quad q = q(p_0, q_0, t) \]

the coordinates $p$, $q$ of the motion such that for $t = 0$, $p$ and $q$ reduce to $p_0$ and $q_0$ respectively. If $H$ is an analytic function, these two functions $p$ and $q$ will likewise be analytic functions of $p_0$, $q_0$, $t$.

(1) See the Méthodes Nouvelles de la Mécanique Céleste, vol. III.

(2) See, for example, his memoir Sopra alcuni criteri di instabilità, Annali di Matematica, ser. III, vol. V, 1901.

(3) Surface Transformations and their dynamical applications, Acta Mathematica, vol. XLIII (1917). We remark that almost all the arguments contained in this memoir remain valid in the case where $T$ is expressed by continuous functions all of whose derivatives are likewise continuous.

After an interval of time $2\pi$, the variable point will again lie in the plane $t = 0$ (1), with values of $p$ and $q$ equal to

\[ p_1 = p(p_0, q_0, 2\pi) = \varphi(p_0, q_0), \qquad q_1 = q(p_0, q_0, 2\pi) = \psi(p_0, q_0). \]

In this way the transformation $T$ of $p$, $q$ into $p_1$, $q_1$ will be given by:

\[ T:\quad p_1 = \varphi(p, q), \qquad q_1 = \psi(p, q). \tag{2} \]

One proves immediately that this transformation is direct, one-to-one and analytic in the neighborhood of the invariant point $p = 0$, $q = 0$, which corresponds to the given periodic motion, and that, moreover, it preserves areas:

\[ \frac{\partial p_1}{\partial p}\frac{\partial q_1}{\partial q} - \frac{\partial p_1}{\partial q}\frac{\partial q_1}{\partial p} = 1. \tag{3} \]


3. On the other hand, given a transformation $T$ of this kind, there always exist corresponding dynamical systems of the form (1) for which $H$ is a function of class $C_\infty$ (2), if not analytic (3).


(1) Here we regard the points $(p, q, t + 2k\pi)$, $k = 0, \pm 1, \pm 2, \ldots$, as identical.

(2) That is to say, $H$ and all its partial derivatives are continuous.

(3) On the Dynamical Role of Poincaré's Last Geometric Theorem.

4. Suppose now that the periodic motion $p = q = 0$ is of the general stable type. In this case the characteristic equation

\[ \lambda^2 - \left(\frac{\partial p_1}{\partial p} + \frac{\partial q_1}{\partial q}\right)_{(0,0)} \lambda + 1 = 0 \tag{4} \]

will have two imaginary roots

\[ \lambda' = e^{\sigma\sqrt{-1}}, \qquad \lambda'' = e^{-\sigma\sqrt{-1}}, \]

such that the ratio $\sigma/2\pi$ is irrational. The series that formally express the coordinates $p$ and $q$ as functions of $t$ are then of trigonometric type.

By means of the transformation $T$, the fundamental question of stability is posed as follows: repeat the transformation $T$ (or $T^{-1}$) indefinitely, and consider the successive images of an arbitrary point $P$ situated at a distance less than $\delta$ from the invariant point $(0, 0)$. Is it always possible to choose $\delta$ small enough that all these images remain at a distance less than $\varepsilon > 0$ from this point, $\varepsilon$ being a given and arbitrarily small number? If so, we have stability in the strict sense of the word. Up to now it has not been possible to solve this difficult problem in its full generality.

As Poincaré remarked (1), in order that there be stability in a given case it is necessary and sufficient that there exist arbitrarily small invariant curves (2) around the invariant point. In this case one can evidently find an infinite sequence of curves $f_n$ converging to this point, with $f_{n+1}$ in the interior of $f_n$ for $n = 1, 2, \ldots$.

5. I have proved (see the article in the Acta Mathematica already cited) that the transformation $T$ can always be reduced to a normal form by the use of formal series:

\[ \begin{aligned} p_1 &= p\cos\bigl(\sigma + c(p^2 + q^2)^m\bigr) - q\sin\bigl(\sigma + c(p^2 + q^2)^m\bigr), \\ q_1 &= p\sin\bigl(\sigma + c(p^2 + q^2)^m\bigr) + q\cos\bigl(\sigma + c(p^2 + q^2)^m\bigr), \end{aligned} \tag{5} \]

where, in general, $m = 1$, $c \ne 0$.

In the integrable case the series $p$, $q$ are convergent, and there exists an analytic family of invariant curves $f$, namely the curves $p^2 + q^2 = \text{const}$. Here, then, is a simple case in which there is strict stability.
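As an illustrative numerical check, which is not part of the original memoir, the following Python sketch applies one step of the normal form (5) with $m = 1$ and verifies at a sample point that $p^2 + q^2$ is unchanged and that the Jacobian determinant of condition (3) equals 1. The values of `SIGMA` and `C`, the sample point, and the helper names `normal_form_map` and `jacobian_det` are assumptions chosen only for illustration.

```python
import math


def normal_form_map(p, q, sigma, c, m=1):
    """One step of the normal form (5): rotate (p, q) by sigma + c*(p^2+q^2)^m."""
    angle = sigma + c * (p * p + q * q) ** m
    return (p * math.cos(angle) - q * math.sin(angle),
            p * math.sin(angle) + q * math.cos(angle))


def jacobian_det(f, p, q, h=1e-6):
    """Central-difference estimate of the determinant d(p1, q1)/d(p, q)."""
    pa, qa = f(p + h, q)
    pb, qb = f(p - h, q)
    dp1_dp, dq1_dp = (pa - pb) / (2 * h), (qa - qb) / (2 * h)
    pa, qa = f(p, q + h)
    pb, qb = f(p, q - h)
    dp1_dq, dq1_dq = (pa - pb) / (2 * h), (qa - qb) / (2 * h)
    return dp1_dp * dq1_dq - dp1_dq * dq1_dp


SIGMA, C = -0.3, 0.7  # assumed sample constants, not taken from the memoir


def step(p, q):
    return normal_form_map(p, q, SIGMA, C)


p0, q0 = 0.4, 0.2
p1, q1 = step(p0, q0)
radius_drift = abs((p1 * p1 + q1 * q1) - (p0 * p0 + q0 * q0))  # circles are invariant
det = jacobian_det(step, p0, q0)                               # area is preserved
```

The finite-difference determinant is only an approximation; the exact statement is that in polar coordinates the map leaves $r$ fixed and shifts $\theta$ by an amount depending on $r$ alone, which preserves the area element.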

(1) Méthodes Nouvelles de la Mécanique Céleste, vol. III, pp. 149-151.

(2) That is to say, boundaries of a connected open region.

6. I have also studied the form of the invariant curves in the general non-integrable case, under the sole hypothesis $c \ne 0$. If suitable variables $p$, $q$ are used, there exists a circle with its center at the origin for which the transformation $T$ turns every radial direction to the left or to the right of the new radial direction according as $c > 0$ or $c < 0$, while the inverse transformation $T^{-1}$ turns these directions in the opposite sense. Considering only the interior of such a circle, I have proved, among others, the following facts:

(1) Every invariant curve $f$ is of the form $r = f(\theta) > 0$ ($r$, $\theta$ being polar coordinates), where $f(\theta)$ is a continuous periodic function of period $2\pi$ for which the ratio

\[ \frac{f(\theta_1) - f(\theta_2)}{\theta_1 - \theta_2} \]

is uniformly bounded.

(2) To each invariant curve there corresponds a rotation coefficient $\tau$, which gives in a certain sense the mean increase undergone by the angular variable $\theta$ of the points $(r, \theta)$ of this curve under the transformation $T$ (1). Supposing $c > 0$ (2), the invariant curve $f_1$ will contain the invariant curve $f_2$ in its interior if $\tau_1 > \tau_2$, and conversely, if $f_2$ lies in the interior of $f_1$, then $\tau_1 > \tau_2$.

(3) The same coefficient $\tau$ cannot belong to more than one curve $f$, except in the case $\tau = 2m\pi/n$ ($m$, $n$ being integers). In this case all the curves corresponding to this value of $\tau$ necessarily have one or more points in common; these common points will be invariant points of $T^n$ such that the increase of $\theta$ is equal to $2m\pi$.

(4) (3) Every curve $f$ that belongs to a value $\tau = 2m\pi/n$ is either an analytic curve or a curve composed of a finite number of analytic arcs, except perhaps at their end points, where they remain of class $C_\infty$. These end points are invariant points of unstable type for the transformation $T^n$, and the corresponding arcs are asymptotic to these points.

(5) The set of curves $f$ and of coefficients $\tau$ is closed and always contains the invariant curve $r = 0$ with the coefficient $\tau = \sigma$.

(1) After $n$ successive iterations of $T$ the value of this increase always lies between $n\tau - 2\pi$ and $n\tau + 2\pi$, according to the results of Poincaré.

(2) This hypothesis does not restrict the generality, since $T$ and $T^{-1}$ may be interchanged.

(3) When $T$ is of class $C_\infty$ but not analytic, result (4) must be suitably modified.
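To make the notion of rotation coefficient in (2) concrete, here is a minimal Python sketch, offered as an illustration under stated assumptions rather than as anything in the memoir. It takes the integrable normal form (5) with $m = 1$, accumulates the increase of the polar angle along an orbit, and divides by the number of iterations. The constants `SIGMA` and `C`, the radii, and the function names `twist` and `rotation_coefficient` are hypothetical; the angle increment is wrapped into $[-\pi, \pi)$, which is valid only while $|\sigma + c r^2| < \pi$.

```python
import math


def twist(p, q, sigma, c):
    """Integrable normal form (5) with m = 1."""
    angle = sigma + c * (p * p + q * q)
    return (p * math.cos(angle) - q * math.sin(angle),
            p * math.sin(angle) + q * math.cos(angle))


def rotation_coefficient(p, q, sigma, c, n=1000):
    """Mean increase of the polar angle per iteration over n iterations."""
    total = 0.0
    for _ in range(n):
        p_next, q_next = twist(p, q, sigma, c)
        delta = math.atan2(q_next, p_next) - math.atan2(q, p)
        # wrap the increment into [-pi, pi); assumes |sigma + c*r^2| < pi
        delta = (delta + math.pi) % (2.0 * math.pi) - math.pi
        total += delta
        p, q = p_next, q_next
    return total / n


SIGMA, C = -0.3, 0.7  # assumed sample constants with c > 0

tau_inner = rotation_coefficient(0.5, 0.0, SIGMA, C)  # circle r = 0.5
tau_outer = rotation_coefficient(0.8, 0.0, SIGMA, C)  # circle r = 0.8
```

In this integrable example the coefficient on the circle of radius $r$ is simply $\sigma + c r^2$, so for $c > 0$ the outer circle carries the larger coefficient, in agreement with (2).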

7. We see therefore that several cases may arise: (a) there exist no curves $f$ other than $r = 0$; (b) such curves exist and correspond to all the values of $\tau$ in a certain interval; and finally (c) there exist curves $f$, but they do not correspond to all the values of this interval.

In the first case we have a region of instability around the invariant point, which is properly unstable according to Poincaré's criterion. In the third case we shall have annular regions of instability between two invariant curves $f_1$ and $f_2$ such that there exists no value of $\tau$ between $\tau_1$ and $\tau_2$; recall that the set of values of $\tau$ is closed. These annular regions of instability possess remarkable properties that closely resemble those of the regions of instability of the first case. In particular, one can find points in the neighborhood of any part of one of the two boundaries $f$ of the annular region such that some of their images under $T$ or $T^{-1}$ lie in the immediate neighborhood of the other boundary.

Up to now the existence of such annular regions of instability has never been proved. The principal aim of this article is to prove it under the hypothesis that $T$ is analytic and that $H$ is of class $C_\infty$, if not analytic.

If one could go further and prove that there exist such annular regions that reach the origin (that is to say, that one of the curves $f_1$, $f_2$ reduces to the curve $r = 0$), the problem of stability would be solved in the negative sense.


8. To prove the result just stated, we shall first consider a composite transformation of the form $T_k = T_0 R_k$. Here $T_0$ denotes the transformation (5) with $m = 1$, $0 < c < \pi$; $R_k$ is a transformation depending on a parameter $k$ which for $k = 0$ becomes the identity transformation and which, for every $k$, preserves areas and admits the origin and the points of the circle $r = 1$ as invariant points. We shall presently define the transformation $R_k$ in a completely precise manner.


Using the modified polar coordinates $\rho = r^2$, $\theta$, the transformation $T_0$ is written

\[ \rho_1 = \rho, \qquad \theta_1 = \theta + \sigma + c\rho. \tag{6} \]

The condition that an arbitrary transformation, expressed by means of these coordinates, preserve areas is that the corresponding functional determinant be equal to unity.

Let us choose the constant $\sigma$ so that it is negative but algebraically greater than $-c$. In this case the transformation $T_0$ leaves invariant not only the origin but also all the points of the circle $\rho = -\sigma/c$ in the circular region $\rho < 1$.
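The following Python sketch, included only as an illustration, writes $T_0$ in the coordinates $(\rho, \theta)$ as a pure twist $\rho_1 = \rho$, $\theta_1 = \theta + \sigma + c\rho$ and checks that, for a choice with $-c < \sigma < 0$, the points of the circle $\rho = -\sigma/c$ are fixed while the circle $\rho = 1$ is advanced by $\sigma + c$. The numerical values of `SIGMA` and `C` and the names `t0` and `angular_distance` are assumptions made for the example.

```python
import math


def t0(rho, theta, sigma, c):
    """T_0 in the coordinates (rho, theta), rho = r^2: rho fixed, theta shifted."""
    return rho, (theta + sigma + c * rho) % (2.0 * math.pi)


def angular_distance(a, b):
    """Distance between two angles measured on the circle."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


SIGMA, C = -0.3, 0.7       # assumed sample values with -c < sigma < 0 and 0 < c < pi
rho_star = -SIGMA / C      # the circle of invariant points, rho = -sigma/c < 1

rho1, theta1 = t0(rho_star, 1.0, SIGMA, C)
fixed_error = angular_distance(theta1, 1.0)   # should vanish up to rounding

_, advance_on_unit_circle = t0(1.0, 0.0, SIGMA, C)  # angle gained on rho = 1
```

The Jacobian matrix of this map in $(\rho, \theta)$ has rows $(1, 0)$ and $(c, 1)$, so its determinant is 1, which is the area condition stated above.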

Let us now define the transformation $R_k$ by the equations (1)

\[ p_1 = p + k\,\frac{\partial u}{\partial q_1}(p, q_1), \qquad q = q_1 + k\,\frac{\partial u}{\partial p}(p, q_1), \tag{7} \]

where, for example,

\[ u(p, q_1) = -(p^2 + q_1^2 - 1)^{[\ldots]}\,(p^2 + q_1^2)^{[\ldots]}\,p. \]

For $k$ sufficiently small one sees that this transformation is direct, one-to-one and analytic, and that it reduces to the identity transformation for $k = 0$. Moreover, the origin and the points of the circle $\rho = 1$ are invariant, and areas are preserved for every value of $k$, as a direct computation of the functional determinant shows. Hence $R_k$ does indeed have all the properties mentioned above.
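A transformation defined implicitly in the manner of (7), through a function $u$ of the mixed variables $p$ and $q_1$, preserves areas for any smooth $u$. The Python sketch below illustrates this numerically; it is not the author's computation. The particular polynomial $u(p, q_1) = -(s - 1)^2 s^2 p$ with $s = p^2 + q_1^2$ is an assumption (the exponents are illegible in the source), as are the value of `K`, the fixed-point iteration used to solve for $q_1$, and the names `grad_u`, `r_k` and `jacobian_det`.

```python
def grad_u(p, q1):
    """Partial derivatives (u_p, u_q1) of the ASSUMED u = -(s-1)^2 s^2 p, s = p^2 + q1^2."""
    s = p * p + q1 * q1
    g = (s - 1.0) ** 2 * s ** 2                  # g(s) = (s-1)^2 s^2
    dg = 2.0 * s * (s - 1.0) * (2.0 * s - 1.0)   # g'(s)
    return -2.0 * p * p * dg - g, -2.0 * p * q1 * dg


def r_k(p, q, k, iterations=200):
    """Solve q = q1 + k*u_p(p, q1) for q1 by fixed-point iteration (small k),
    then set p1 = p + k*u_q1(p, q1)."""
    q1 = q
    for _ in range(iterations):
        u_p, _ = grad_u(p, q1)
        q1 = q - k * u_p
    _, u_q1 = grad_u(p, q1)
    return p + k * u_q1, q1


def jacobian_det(f, p, q, h=1e-5):
    """Central-difference estimate of the determinant d(p1, q1)/d(p, q)."""
    pa, qa = f(p + h, q)
    pb, qb = f(p - h, q)
    dp1_dp, dq1_dp = (pa - pb) / (2 * h), (qa - qb) / (2 * h)
    pa, qa = f(p, q + h)
    pb, qb = f(p, q - h)
    dp1_dq, dq1_dq = (pa - pb) / (2 * h), (qa - qb) / (2 * h)
    return dp1_dp * dq1_dq - dp1_dq * dq1_dp


K = 0.05  # assumed small parameter value

det = jacobian_det(lambda p, q: r_k(p, q, K), 0.5, 0.3)  # area preservation
origin_image = r_k(0.0, 0.0, K)                          # origin stays fixed
circle_image = r_k(0.6, 0.8, K)                          # a point of the unit circle
```

With this assumed $u$, the factor $(s - 1)^2$ makes both partial derivatives vanish on the unit circle, and the factor $s^2$ makes them vanish at the origin, so those points are left fixed, as the text requires of $R_k$.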

If the variables $p_1$, $q_1$ are expressed in terms of $p$, $q$, one obtains the following series:

\[ p_1 = p + k\,\frac{\partial u}{\partial q}(p, q) + \cdots, \qquad q_1 = q - k\,\frac{\partial u}{\partial p}(p, q) + \cdots. \tag{8} \]

In terms of the coordinates $\rho$, $\theta$, these equations take the following form:

\[ \rho_1 = \rho + 2k\,\frac{\partial u}{\partial \theta} + \cdots, \qquad \theta_1 = \theta - 2k\,\frac{\partial u}{\partial \rho} + \cdots. \tag{9} \]

We shall not use these last equations in the immediate neighborhood of the origin.

(1) For the use of equations of this type see the article by E. Goursat, "Sur les transformations ponctuelles qui conservent les volumes", Bulletin des Sciences Mathématiques, ser. 3, vol. V, 1901.


9. Consider now the composite transformation $T_k = T_0 R_k$. For small $k$ it is evident that this transformation possesses the following properties:

(a) $T_k$ is direct, one-to-one and analytic in $p$, $q$, and varies analytically with the parameter $k$;

(b) it preserves areas;

(c) it admits the origin as a simple invariant point, with expansions in $p$, $q$ that coincide with those of $T_0$ up to the fourth order;

(d) the circle $\rho = 1$ is an invariant curve $f$ for $T_k$, all of whose points are advanced by an angle $\sigma + c < 2\pi$;

(e) for $k = 0$, $T_k$ reduces to $T_0$, whose invariant points are the simple invariant point (of stable type) at the origin and all the points of the circle $\rho = -\sigma/c < 1$.


10. Let us now prove two further properties of the auxiliary transformation $T_k$:

(f) $T_k$ turns radial directions to the left of the radial direction, at least for very small $k$;

(g) in the circle $\rho < 1$, $T_k$ admits only two invariant points other than the origin. These points are simple and vary analytically with $k$. They reduce to $(-\sigma/c, 0)$ and $(-\sigma/c, \pi)$ (coordinates $(\rho, \theta)$) for $k = 0$; the first of these points is of stable type, the second of unstable type.

To prove (f), observe that for small $k$ the radial directions turn in the manner indicated, at least for points near the origin, because the expansions of $p_1$, $q_1$ in series in $p$, $q$ do not change up to the fourth order and vary analytically with $k$. On the other hand, the angle through which the radial direction turns to the left is an analytic function of $p$, $q$ (the origin excluded), a function that is positive for $k = 0$ except at the origin, and that therefore remains positive outside a given circle $\rho = \delta > 0$ for $k$ sufficiently small.

To prove (g) we must examine the invariant points of $T_k$. Evidently such points cannot exist for small $k$ except in the neighborhood of $\rho = 0$ and of $\rho = -\sigma/c$, which give the invariant points of $T_0$. The point $\rho = 0$ is a "simple" invariant point of $T_0$. Consequently, for small $k$, every invariant point of $T_k$ in the neighborhood of $\rho = 0$ is obtained by analytic variation of the point at the origin. Since the origin is an invariant point for every $k$, it follows that $T_k$ possesses no invariant point in the neighborhood of the origin other than this point itself.

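To study the other invariant points we shall use the coordinates $\rho$, $\theta$. In terms of these coordinates the transformation $T_k$ is written

$$
\begin{aligned}
\rho_1 &= \rho + 2k\,\frac{\partial u^*}{\partial \theta}(\rho,\ \theta + \sigma + c\rho) + \cdots,\\
\theta_1 &= \theta + \sigma + c\rho - 2k\,\frac{\partial u^*}{\partial \rho}(\rho,\ \theta + \sigma + c\rho) + \cdots,
\end{aligned}
\tag{10}
$$

where

$$u^*(\rho, \theta) = u(p, q).$$

At an invariant point, $\rho_1 = \rho$, $\theta_1 = \theta$, we shall have

$$
\begin{aligned}
\frac{\partial u^*}{\partial \theta}(\rho,\ \theta + \sigma + c\rho) + kA_1 + \cdots &= 0,\\
\sigma + c\rho - 2k\,\frac{\partial u^*}{\partial \rho}(\rho,\ \theta + \sigma + c\rho) + k^2 B_1 + \cdots &= 0,
\end{aligned}
\tag{11}
$$

where $A_1, A_2, \ldots, B_1, B_2, \ldots$ are analytic functions of $\rho$ and $\theta$, periodic of period $2\pi$ in $\theta$; moreover these series converge uniformly for the values of $\rho$ and $\theta$ considered here ($0 < \rho \le 1$, $\theta$ arbitrary).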

But with our particular choice of $u$ we shall have

$$u^*(\rho, \theta) = -(1 - \rho)^2 \rho^2 \cos\theta.$$

The first equation (11) therefore shows that for every value of $\rho$ near $-\sigma/c$ there exist two corresponding values of $\theta$ which satisfy this equation, one near $0$, the other near $\pi$:

$$\theta' = -\sigma - c\rho + k f_1(\rho) + \cdots, \qquad \theta'' = \pi - \sigma - c\rho + k g_1(\rho) + \cdots. \tag{12}$$

Here $f_1, f_2, \ldots, g_1, g_2, \ldots$ are analytic in $\rho$.

Substituting these values of $\theta$ into the second equation (11), we obtain two equations of the following forms:

$$
\begin{aligned}
\sigma + c\rho + k C_1(\rho) + \cdots &= 0,\\
\sigma + c\rho + k D_1(\rho) + \cdots &= 0,
\end{aligned}
\tag{13}
$$

where $C_1, C_2, \ldots, D_1, D_2, \ldots$ are analytic in $\rho$.

In this way we obtain the corresponding values $\rho'$ and $\rho''$ of $\rho$:

$$
\begin{aligned}
\rho' &= -\frac{\sigma}{c} + k E_1 + \cdots,\\
\rho'' &= -\frac{\sigma}{c} + k F_1 + \cdots,
\end{aligned}
\tag{14}
$$

where the coefficients are constants.

Hence for $k \neq 0$ and small there exist exactly two invariant points other than the origin, which vary analytically with $k$ and reduce to $(-\sigma/c,\ 0)$ and $(-\sigma/c,\ \pi)$ for $k = 0$.

It remains to consider the characteristic equations (4) of these two invariant points. Supposing $k > 0$, the first equation, which is written

>.* 2(1 kc^i ..!)! + ...]ui = 0| 

will have two real roots, one $< 1$ and the other $> 1$. The corresponding invariant point is therefore simple and formally unstable. The form of the other equation is the following:

...]> + 1 = 0. 

Here the two roots are conjugate imaginary. This invariant point is likewise simple, but stable from the formal point of view, at least if $k$ is chosen so that these roots are not $n$-th roots of unity.

Consequently the transformation $T_k$ so defined will have all the properties stated.
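The dichotomy between the two invariant points can be made concrete. For an area-preserving transformation the characteristic equation at an invariant point has the form $\lambda^2 - s\lambda + 1 = 0$, where $s$ is the trace of the linear part, so the type is decided by whether $|s|$ exceeds $2$. The Python sketch below classifies a point from an assumed trace value; the function names and the sample traces are hypothetical and are not taken from the text.

```python
import cmath


def multipliers(trace):
    """Roots of lambda**2 - trace*lambda + 1 = 0 (their product is 1)."""
    root = cmath.sqrt(trace * trace - 4.0)
    return (trace + root) / 2.0, (trace - root) / 2.0


def formal_type(trace, tol=1e-12):
    """Classify an invariant point of an area-preserving map by its trace."""
    if abs(trace) > 2.0 + tol:
        return "unstable"    # two real roots, of modulus < 1 and > 1
    if abs(trace) < 2.0 - tol:
        return "stable"      # two conjugate imaginary roots of modulus 1
    return "degenerate"      # double root +1 or -1


for s in (2.1, 1.9):  # sample traces slightly above and below 2
    l1, l2 = multipliers(s)
    print(s, formal_type(s), l1, l2)
```

In the stable case the text adds the proviso that the roots should not be roots of unity; the sketch does not test for that.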


11. — But for small $k$, $T_k$ has at least two invariant curves, $f$, namely $\rho = 0$ and $\rho = 1$, with rotation coefficients $\sigma$ and $\sigma + c$ respectively. Hence either there exists a succession of intermediate invariant curves corresponding to the whole interval $\sigma \le \tau \le \sigma + c$, or else there exist annular regions of instability, as we wish to prove. It therefore remains only to consider the first possibility.

In this case there must exist at least one invariant curve, $f_0$, which corresponds to the intermediate value $0$ of $\tau$, and which necessarily contains the simple invariant point of unstable type.

I say that there cannot exist only one curve of this kind. For in the contrary case this curve would contain two of the asymptotic branches issuing from the unstable point, and there would be the two possibilities indicated in the accompanying figure.

[Figure 1]

$I_1$ and $I_2$ denote the stable and the unstable invariant point respectively; the asymptotic curve $f_0$ is met by any ray issuing from the origin in a single point, and the direction of motion of the points of this curve is that indicated by the arrows; indeed every radial direction rotates to the left near the point $I_1$. But in the case considered the curves $f$ must approach $f_0$ uniformly as $\tau$ tends to the limiting value $0$, which is impossible, because such an invariant curve cannot cut the two free asymptotic branches issuing from $I_1$.

(5) It is a well-known fact that the two positive asymptotic branches and the two negative asymptotic branches form a single invariant curve at the point $I$.

Under these conditions it is evident that only one possibility remains: that in which there would be two asymptotic invariant curves, $f_1$ and $f_2$, formed by four asymptotic branches issuing from $I_1$ and coinciding in pairs. This possibility is indicated in figure 2.

[Figure 2]

One might still hope to prove directly by calculation that this possibility does not arise. Indeed it seems infinitely improbable that the four branches coincide in the manner indicated above. Nevertheless this calculation does not appear to be simple, and I prefer to avoid the difficulty in the way indicated in the following paragraph.

12. — We may therefore admit provisionally that $T_k$ possesses two invariant curves of the form indicated in figure 2.

Consider then the composite transformation $T'_k = T_k S$, where $S$ is defined as the identity transformation outside a small circle $\gamma$ about a point $P$ on the outer asymptotic branch (see the figure). Inside $\gamma$, $S$ is a rotation about the center $P$ through a variable but small angle which reduces to $0$ at the center and on the circumference. Evidently the transformation $S$ can be chosen of class $C^\infty$ and such that $S$ rotates the radial directions to the left through an angle as small as desired. Moreover $S$ will preserve areas. The composite transformation $T'_k$ enjoys the analogous properties.
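A transformation of the kind described for $S$ can be sketched as follows: inside a small disk each point is rotated about the center through an angle that depends only on its distance from the center and vanishes, together with all its derivatives, at the center and on the circumference. Such a map preserves areas because it is a pure rotation on every concentric circle. The center, radius, amplitude `eps` and the particular bump function are assumptions made for this illustration, not the author's choices.

```python
import math


def bump(s):
    """C-infinity function of s, positive on (0, 1) and flat (zero) at 0 and 1."""
    if s <= 0.0 or s >= 1.0:
        return 0.0
    return math.exp(-1.0 / (s * (1.0 - s)))


def local_twist(x, y, center=(0.6, 0.0), radius=0.1, eps=0.5):
    """Rotate (x, y) about `center` by a small angle depending on the distance."""
    dx, dy = x - center[0], y - center[1]
    angle = eps * bump(math.hypot(dx, dy) / radius)
    if angle == 0.0:
        return x, y  # identity at the center and outside the disk
    ca, sa = math.cos(angle), math.sin(angle)
    return center[0] + ca * dx - sa * dy, center[1] + sa * dx + ca * dy


def jacobian_det(f, x, y, h=1e-6):
    """Central-difference estimate of the Jacobian determinant of f at (x, y)."""
    x_plus, x_minus = f(x + h, y), f(x - h, y)
    y_plus, y_minus = f(x, y + h), f(x, y - h)
    a = (x_plus[0] - x_minus[0]) / (2.0 * h)
    b = (y_plus[0] - y_minus[0]) / (2.0 * h)
    c = (x_plus[1] - x_minus[1]) / (2.0 * h)
    d = (y_plus[1] - y_minus[1]) / (2.0 * h)
    return a * d - b * c


det_inside = jacobian_det(local_twist, 0.65, 0.02)
print(det_inside, local_twist(0.9, 0.3))
```

The determinant estimate stays close to 1 inside the disk, and points outside the disk are left where they are.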

But such a transformation $T'_k$ of class $C^\infty$ always possesses asymptotic branches of class $C^\infty$, as in the analytic case, with the obvious modification that these branches are of class $C^\infty$ but not necessarily analytic (6). In the case considered these branches can even be found directly. Indeed these branches are the same for $T'_k$ as for $T_k$ in the neighborhood of the invariant point $I_1$. On iterating $T'_k$ successively, the upper part of the outer branch (see the figure) extends in the same way as that of $T_k$, at least as far as the point $A$ where this branch meets the circle $\gamma$. Now if $T'_k$ is repeated once more, one begins with the transformation $T_k$, which extends this part as far as $A_1$, and then performs the transformation $S$, which modifies the arc $AB$ interior to the circle, changing it into another which cuts $AB$ only once (see the figure).

On the other hand, by successive iteration of $T'^{-1}_k = S^{-1} T_k^{-1}$, the lower branch can be extended in the same way as far as the point $B$ (see the figure) where this branch meets $\gamma$. If now $T'^{-1}_k$ is repeated once more, one begins with $S^{-1}$, which does not modify the part of the asymptotic branch already obtained, and then performs the transformation $T_k^{-1}$, which extends it as far as the point $B_{-1} = T_k^{-1}(B)$ on the same asymptotic curve as that of $T_k$.

Hence the two interior asymptotic branches of $T'_k$ intersect at the point $P$.

13. — But this transformation $T'_k$ would then have all the stated properties of $T_k$, with the sole modification that the condition of being analytic must be replaced by that of being of class $C^\infty$. Moreover, the asymptotic branches do not coincide in pairs in this case.

Hence there exist annular regions of instability for such transformations $T'_k$ of class $C^\infty$, if not analytic.

(6) Here it is only a question of a simple point of unstable type. See my article already cited (§§ 33-40).


14. — We shall now prove that the same conclusion holds for analytic transformations.

Let us remark that we can construct transformations $T_{0,t}$, $R_{k,t}$, $S_t$ such that:

(a) for $t = 0$ these transformations reduce to the identity transformation, and for $t = 2\pi$ to $T_0$, $R_k$ and $S$ respectively;

(b) they are direct, one-to-one, of class $C^\infty$ in $p$, $q$, $t$, and even analytic in $t$, and they preserve areas for every $k$;

(c) they leave invariant the origin and the circle $\rho = 1$ whatever $k$ may be.

Indeed we can define $T_{0,t}$ as the following transformation:

$$\rho_1 = \rho, \qquad \theta_1 = \theta + (\sigma + c\rho)\,\frac{t}{2\pi},$$

which evidently possesses the three properties stated. In an analogous manner we can define $R_{k,t}$ by the equations (1), slightly modified:


/>, 


P 2 -k dq * 


9 = 9x •+■ 


ki # L ) * 

2r. ' bp 


with the same choice of the function $u$. As for $S_t$, we shall define it in the same way as $S$, by a rotation in the circle $\gamma$ reduced in the ratio $t : 2\pi$.

The composite transformation $T'_t = T_{0,t} R_{k,t} S_t$ will likewise enjoy the three properties (a), (b), (c), extended to $T'_k$.

Let us now employ a geometric representation; consider the variables $p$, $q$, $t$ as rectangular coordinates of a point in space.

As $t$ varies from $0$ to $2\pi$, the points $(p, q, 0)$ of the plane $t = 0$ may be displaced to $(p_1, q_1, t)$, where $(p_1, q_1)$ is the image of $(p, q)$ under $T'_t$. We then see that each point describes a trajectory which begins at the point $(p, q, 0)$ and ends at the point $(p_1, q_1, 2\pi)$, where $(p_1, q_1)$ is obtained from $(p, q)$ by means of the transformation $T'_k$. The totality of these trajectories fills the cylindrical region $\rho \le 1$ between the two planes $t = 0$ and $t = 2\pi$, and the direction of the unique trajectory which passes through each point $(p, q, t)$ is expressed by differential equations

$$\frac{dp}{dt} = \varphi(p, q, t), \qquad \frac{dq}{dt} = \psi(p, q, t), \tag{15}$$

where $\varphi$ and $\psi$ are of class $C^\infty$ with respect to $p$, $q$ and $t$, but not necessarily periodic, of period $2\pi$, in $t$. Evidently the $t$-axis is one of these trajectories, since $\varphi(0, 0, t) = \psi(0, 0, t) = 0$.

Since the transformation $T'_t$ preserves areas for every value of $t$, it is evident that every tubular region of trajectories with a base $d\sigma$ in the plane $t = 0$ will cut every other plane $t = \text{const.}$ in an area always equal to $d\sigma$. Hence the volume of an elementary cylinder of base $d\sigma$ and height $dt$ will always be $d\sigma\,dt$. It follows that volumes are preserved by the fluid motion defined by the differential equations (15), that is to say by

$$\frac{dp}{dt} = \varphi(p, q, \tau), \qquad \frac{dq}{dt} = \psi(p, q, \tau), \qquad \frac{d\tau}{dt} = 1.$$

According to the usual rule we shall therefore have

$$\frac{\partial \varphi}{\partial p} + \frac{\partial \psi}{\partial q} = 0. \tag{16}$$

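But equation (16) shows that there is a function $H(p, q, t)$, of class $C^\infty$ in $p$, $q$, $t$, and entirely determined by (16) if we add the condition $H(0, 0, t) = 0$, such that

$$\varphi(p, q, t) = -\frac{\partial H(p, q, t)}{\partial q}, \qquad \psi(p, q, t) = \frac{\partial H(p, q, t)}{\partial p}.$$

Hence the equations (15) are of Hamiltonian form,

$$\frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p} \qquad \bigl(H = H(p, q, t),\ H(0, 0, t) = 0\bigr),$$

where $H$ is not periodic with respect to $t$.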
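The passage from (16) to the function $H$ rests on a standard argument that the text does not spell out. One possible way to write it is sketched below; the sign convention $\varphi = -\partial H/\partial q$, $\psi = \partial H/\partial p$ and the choice of the origin as base point of the line integral are assumptions made here to match the normalization at $p = q = 0$.

```latex
% If \varphi = -\partial H/\partial q and \psi = \partial H/\partial p, then (16) holds:
\frac{\partial\varphi}{\partial p} + \frac{\partial\psi}{\partial q}
  = -\frac{\partial^2 H}{\partial p\,\partial q} + \frac{\partial^2 H}{\partial q\,\partial p} = 0 .

% Conversely, (16) says that \psi\,dp - \varphi\,dq is a closed form on the disk
% \rho \le 1, which is simply connected, so for each fixed t the line integral
H(p,q,t) = \int_{(0,0)}^{(p,q)} \bigl(\psi\,dp - \varphi\,dq\bigr)

% is independent of the path and satisfies
\frac{\partial H}{\partial p} = \psi, \qquad
\frac{\partial H}{\partial q} = -\varphi, \qquad
H(0,0,t) = 0 .
```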

We shall now try to find another function $\bar H(p, q, t)$, analytic in $p$, $q$, $t$, which is approximately equal to $H(p, q, t)$ and such that the corresponding analytic transformation $\bar T$ also has the other properties of $T'_k$ which are essential for the end we have in view.

15. — These properties are essentially the following:

(a) The circle $\rho = 1$ is an invariant curve of $\bar T$;

(b) The point $\rho = 0$ is a stable invariant point for $\bar T$, with the same expansions of $p_1$, $q_1$ in terms of $p$, $q$ up to the fourth order.

Indeed, such a transformation $\bar T$ will have two simple invariant points near those of $T'_k$, with asymptotic branches approximately the same, which will consequently intersect. On the other hand the radial directions will be rotated to the left by $\bar T$, both in the neighborhood of the invariant point (because of property (b)) and at a considerable distance from this point (because $\bar T$ differs little from $T'_k$).

To choose such a function $\bar H$, let us remark that

$$p\,\frac{\partial H}{\partial q} - q\,\frac{\partial H}{\partial p} = 0$$

on the cylinder $\rho = 1$. Indeed the relation $p^2 + q^2 = 1$ at any one instant entails this relation for every $t$. Hence

$$\frac{d}{dt}\,(p^2 + q^2) = 0.$$

But this relation requires that $H$ reduce on the cylinder to a function of $t$ alone, so that we may write

$$H(p, q, t) = H(1, 0, t) + (p^2 + q^2 - 1)\,J(p, q, t),$$

where $J(p, q, t)$ is of class $C^\infty$. Moreover, the expansion of $J(p, q, t)$ up to the terms of the fourth order determines that of $H$; we remark that $J$ as well as $H$ is analytic except on the circle $\gamma$. Furthermore, since $p = q = 0$ is a trajectory, the partial derivatives of $H$ of the first order with respect to $p$ and $q$ vanish identically along it, and the same property holds for $J$.
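The role of the factor $p^2 + q^2 - 1$ can be checked by a short computation. The lines below are a sketch of that verification, assuming the Hamiltonian equations in the form $dp/dt = -\partial H/\partial q$, $dq/dt = \partial H/\partial p$.

```latex
% Rate of change of p^2 + q^2 along a trajectory:
\frac{d}{dt}\,(p^2 + q^2)
  = 2\Bigl(q\,\frac{\partial H}{\partial p} - p\,\frac{\partial H}{\partial q}\Bigr).

% For H = H(1,0,t) + (p^2 + q^2 - 1)\,J(p,q,t):
\frac{\partial H}{\partial p} = 2pJ + (p^2 + q^2 - 1)\,\frac{\partial J}{\partial p},
\qquad
\frac{\partial H}{\partial q} = 2qJ + (p^2 + q^2 - 1)\,\frac{\partial J}{\partial q},

% so the terms in 2pqJ cancel and
q\,\frac{\partial H}{\partial p} - p\,\frac{\partial H}{\partial q}
  = (p^2 + q^2 - 1)\Bigl(q\,\frac{\partial J}{\partial p} - p\,\frac{\partial J}{\partial q}\Bigr),

% which vanishes on the cylinder p^2 + q^2 = 1.
```

The same computation applies to any other function put in the place of $J$, which is why the invariance of the circle survives the replacement of $J$ by an analytic approximation.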

Let us now choose (if possible) a function $\bar J$, differing little from $J$, analytic in $p$, $q$, $t$, and admitting the same expansion as $J$ up to the terms of the fourth order. Consider the corresponding function

$$\bar H = H(1, 0, t) + (p^2 + q^2 - 1)\,\bar J(p, q, t)$$

and the Hamiltonian equations which it defines. We see at once that the corresponding transformation $\bar T$ will have all the required properties, in particular (a) and (b).

Admitting for the moment the almost evident fact that such a function $\bar J$ can be chosen, we see that there exist annular regions of instability for analytic transformations.

To avoid difficulties in this manner of reasoning, we suppose that all the partial derivatives of $\bar J$ up to the fifth order differ little from the corresponding derivatives of $J$; we shall prove the possibility of such a choice later (paragraph 17). Evidently, under these conditions an arbitrary radial direction will undergo a rotation to the left under $\bar T$ as under $T'_k$.

16. — In the Hamiltonian equations which correspond to this function $\bar H$, with $0 \le t \le 2\pi$, we can extend the definition of $\bar H$ so as to make it periodic. Naturally this function $\bar H$ is not always analytic, nor even continuous, for $t = 0, \pm 2\pi, \ldots$.

Suppose now that we carry out, in the direction of the $t$-axis, the deformation indicated by the equation

$$\bar t = f(t).$$

The inverse function $f^{-1}(\bar t)$ is here of class $C^\infty$ and increases from $0$ to $2\pi$ with $\bar t$, in such a way that $dt/d\bar t$ is positive except for $\bar t = 0$ and $\bar t = 2\pi$, where all the derivatives vanish together. After this deformation, the modified trajectories will have their directions parallel to the $t$-axis on the two planes $\bar t = 0$ and $\bar t = 2\pi$, and we see that the trajectories are of class $C^\infty$ everywhere. With this new independent variable the differential equations retain their Hamiltonian form, with a new function

$$H^* = \bar H\,\frac{dt}{d\bar t},$$

which is evidently of class $C^\infty$ in $p$, $q$, $\bar t$, of period $2\pi$ in $\bar t$, and analytic except for $\bar t = 0, \pm 2\pi, \ldots$ if we have chosen a function $f^{-1}$ having the same properties. This change of independent variable does not modify the transformation $\bar T$ associated with the initial differential equations.

Hence there exist annular regions of instability for dynamical systems (1), with $H$ of class $C^\infty$ and even analytic, except perhaps for $t = 0, \pm 2\pi, \ldots$, and with $T$ analytic.
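A change of time variable with the properties required in paragraph 16 can be written down explicitly. The Python sketch below builds one candidate for the inverse function: it is of class $C^\infty$, increases from $0$ to $2\pi$, and has all its derivatives equal to zero at both ends. The construction from $e^{-1/x}$ and the names `flat`, `smooth_step` and `reparam` are assumptions chosen for the illustration; the text does not specify which function is used.

```python
import math


def flat(x):
    """exp(-1/x) for x > 0 and 0 otherwise; every derivative vanishes at x = 0."""
    return math.exp(-1.0 / x) if x > 0.0 else 0.0


def smooth_step(s):
    """C-infinity step: 0 for s <= 0, 1 for s >= 1, increasing in between."""
    a, b = flat(s), flat(1.0 - s)
    return a / (a + b)


def reparam(t_bar):
    """Candidate inverse time change, mapping [0, 2*pi] onto [0, 2*pi]."""
    return 2.0 * math.pi * smooth_step(t_bar / (2.0 * math.pi))


samples = [reparam(2.0 * math.pi * j / 200) for j in range(201)]
print(samples[0], samples[100], samples[-1])
```

Because the function is flat at both ends, multiplying the Hamiltonian by its derivative removes the jump at $t = 0, \pm 2\pi, \ldots$, which is the point of the deformation.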

At first sight one might believe that a slight additional modification would make it possible to find a function $H$ analytic everywhere. But there is a difficulty here, arising from the fact that the expansion of $H$ in terms of the variables $p$, $q$ contains non-analytic coefficients which must be modified. Nevertheless I believe that this method can in fact be applied and, consequently, that a function $H$ can be found which is analytic everywhere. But I have not yet proved this.

In any case, from the point of view of applications, it is the case of a function $H$ of class $C^\infty$ which is of interest.

17. — To complete our reasoning it suffices to prove the following simple lemma:

Let a function

$$f(x_1, \ldots, x_n, t),$$

of class $C^\infty$ for

$$a \le x_i \le b \quad (i = 1, \ldots, n), \qquad -\delta \le t \le 2\pi + \delta,$$

vanish together with all its partial derivatives with respect to $x_1, \ldots, x_n$ up to the $k$-th order for $x_1 = x_2 = \cdots = x_n = 0$; then one can find a function $g(x_1, \ldots, x_n, t)$, analytic in $x_1, \ldots, x_n, t$, with the same property, such that $f - g$ and all its partial derivatives up to the $k$-th order are as small as desired in this region.

Indeed, if we subtract from $J$ the polynomial $P$ which gives its expansion in $p$, $q$ up to the fifth order, we obtain a function $J^*$ to which the preceding lemma can be applied. The analytic approximation $K$ of $J^*$ so obtained (with $n = 3$, $k = 5$) gives us the approximation $K + P$ of $J$ which we seek.

It remains to prove the lemma.

Observe that the lemma is true for $n = 0$, because one can find an analytic function $g(t)$ which differs as little as desired from a given function $f(t)$ of class $C^\infty$; the lemma is also true for $k = 0$.

If then the lemma is not true for every $n$, there exist a least $n > 0$ and a least $k > 0$ for which it is not true. But for such a function $f$ we can write

$$f(x_1, \ldots, x_n, t) = f(x_1, \ldots, x_{n-1}, 0, t) + x_n f_1(x_1, \ldots, x_n, t),$$

where the first term of the right-hand member contains only $n - 1$ variables $x$, while the second term defines a function $f_1(x_1, \ldots, x_n, t)$ of class $C^\infty$, all of whose partial derivatives with respect to $x_1, \ldots, x_n$ vanish up to the $(k-1)$-th order for $x_1 = x_2 = \cdots = x_n = 0$. Hence by two successive applications of the lemma to cases already established one can find an approximation to $f(x_1, \ldots, x_{n-1}, 0, t)$ and an approximation to $f_1(x_1, \ldots, x_n, t)$ which lead us to the approximation to $f(x_1, \ldots, x_n, t)$ which we seek.
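The decomposition used in this induction step is stated without proof. A standard way to exhibit the function $f_1$ is the integral formula sketched below; it assumes that the segment joining $(x_1, \ldots, x_{n-1}, 0)$ to $(x_1, \ldots, x_n)$ lies in the region considered, which holds when $a \le 0 \le b$.

```latex
f(x_1,\ldots,x_n,t) - f(x_1,\ldots,x_{n-1},0,t)
  = \int_0^1 \frac{d}{ds}\, f(x_1,\ldots,x_{n-1},\, s\,x_n,\, t)\,ds
  = x_n \int_0^1 \frac{\partial f}{\partial x_n}(x_1,\ldots,x_{n-1},\, s\,x_n,\, t)\,ds ,

% so one may take
f_1(x_1,\ldots,x_n,t)
  = \int_0^1 \frac{\partial f}{\partial x_n}(x_1,\ldots,x_{n-1},\, s\,x_n,\, t)\,ds .
```

Differentiation under the integral sign shows that $f_1$ is again of class $C^\infty$, and since one derivative in $x_n$ has been used, its derivatives at $x_1 = \cdots = x_n = 0$ vanish only up to order $k - 1$.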


Lecture delivered at the Institut Henri-Poincaré, May 1931.



Reprinted from the Proceedings of the National Academy of Sciences,
Vol. 18, No. 3, pp. 279-282, March, 1932.

RECENT CONTRIBUTIONS TO THE ERGODIC THEORY
By G. D. Birkhoff and B. O. Koopman
Department of Mathematics, Harvard University
Communicated February 13, 1932

I. The Original Formulations. —Early in the kinetic theory of matter 
it was recognized that the replacement of mean values in time by mean 
values in phase-space is an indispensable operation. In order to justify 
this substitution, recourse was had by Boltzmann and Maxwell to their 
celebrated Ergodic Hypothesis, which states that, in the systems considered 
in the kinetic theory, each particular motion will pass, sooner or later, 
through every state of motion, or phase (combination of position and 
velocity), consistent with its energy; and (implicitly) that the length of 
time during which it exhibits a given set of phases is proportional to the 
relative abundance of the latter. In other words, each path-curve in phase- 
space passes through every point of its energy hypersurface, remaining 
in a given region of the latter for a length of time proportional to the 
extent (hypervolume) of the region. It was recognized by Boltzmann and 
Maxwell that account must be taken of all possible single-valued integrals 
of ordinary type, thus reducing the dimensionality of the phase-space. 

The extreme unlikelihood of the Ergodic Hypothesis, owing to the presence of special periodic motions, was pointed out by Kelvin.¹ Its mathematical impossibility was apparent to Poincaré,² who indicated the only possible direction of effective modification, namely, the later Quasi-Ergodic Hypothesis of P. and T. Ehrenfest:³

Let $\Omega$ be the hypersurface in phase-space corresponding to the given value of the energy of the system, and suppose its extent to be finite. Then each non-exceptional path-curve in $\Omega$ passes through every region of $\Omega$ of positive volume, remaining there for an average time equal to the ratio of this volume to that of $\Omega$.

Furthermore, P. and T. Ehrenfest⁴ observed that even if the system is quasi-ergodic in the sense that everywhere dense path-curves exist, the mean time of sojourn along the path-curve through $P$ in a given region $M$ may vary discontinuously from path to path.

For the detailed history of the qualifications and speculations, the reader is referred to the "Encyklopädie der Mathematischen Wissenschaften."⁵

II. Introduction of the Modern Theory of Real Variables into Dynamics. —In his discussion of recurrent motion, H. Poincaré⁶ introduces the fundamental notion of a dynamical property which, without being true for all possible motions, has a probability of one of being realized. Poincaré wrote before Lebesgue's great work, but the very steps of his proof, as well as the formulation of his theorem, are all in almost an exact form for interpretation in terms of the theory of measure. Such an interpretation was accomplished by C. Carathéodory,⁷ who renders "exception of probability zero" as "exceptions forming in $\Omega$ a set of measure zero;" and such is the first entrance into the realm of dynamics of the modern theory of real variables.

Shortly afterward G. D. Birkhoff⁸ conjectured that in systems having everywhere dense path curves the exceptional path curves (i.e., those not everywhere dense in $\Omega$) are of Lebesgue measure zero, a conjecture made also by A. Smekal⁹ in connection with the Quasi-Ergodic Hypothesis.

A few years later, the notion of metrical transitivity was introduced in another connection by G. D. Birkhoff and P. A. Smith.¹⁰ If $(P \to P_t)$ is a continuous one-parameter group of automorphisms of the metrical space $\Omega$, it is said to be metrically transitive if and only if $\Omega$ cannot be decomposed into two subsets each of measure greater than zero, and each invariant under every transformation of the group.

The fundamental rôle to be played by this concept in the ergodic theory will become evident by the developments described below.

A further application of modern analysis to dynamics has recently been made by B. O. Koopman;¹¹ the transformation-group $(P \to P_t)$ in $\Omega$ is interpreted as a linear functional operator (the $U_t$-operator: $U_t f(P) = f(P_t)$) which, when regarded in the space $L_2$ of Lebesgue-measurable functions of summable square, becomes unitary in the case where $(P \to P_t)$ has a positive integral invariant. This $U_t$-operator is then studied by means of its spectral resolution $E(\lambda)$. In this study the notion of non-integrability in $L_2$ is used, which is equivalent to metrical transitivity when the measure of $\Omega$ is finite.¹²

All the results described in this section concern properties true "almost everywhere" in the sense of Lebesgue measure.

III. The Mean Ergodic Theorem. —The first one actually to establish a general theorem bearing fundamentally on the Quasi-Ergodic Hypothesis was J. v. Neumann,¹³ who, with the aid of the above theory of the $U_t$-operator, proved what we will call the Mean Ergodic Theorem, to the following effect:

Under the above hypotheses, if $Z_{\alpha\beta}(P; M)$ denotes the relative sojourn of the moving point $P_t$ ($P_0 = P$) in $M$ in the time interval $\alpha \le t \le \beta$ (i.e., it is the length of this sojourn divided by $\beta - \alpha$), then, as $\beta - \alpha \to +\infty$, $M$ remaining fixed, $Z_{\alpha\beta}(P; M)$ converges in the mean to a limit $Z(P; M)$ (a limiting process involving neighboring motions). Furthermore, $Z(P; M)$ is independent of $P$ if and only if the system is non-integrable in $L_2$,¹² in which case $Z(P; M)$ = measure of $M$/measure of $\Omega$, when $\Omega$ is of finite measure, and = zero otherwise. When the system is integrable in $L_2$, $Z(P; M)$ is given a simple expression in terms of generalized derivatives.


This theorem is important not merely as the first general mathematically rigorous treatment of the question, but because it is sufficient for the needs of the kinetic theory (if metrical transitivity is granted): From the standpoint of statistical mechanics, what is needed, as v. Neumann has observed,¹⁴ is the knowledge that, statistically speaking, time-means of functions on $\Omega$ can be replaced by space-means in $\Omega$; and the convergence in the mean established by v. Neumann is the precise statement that the dispersion or "standard deviation" of the former from the latter vanishes when the time-interval is sufficiently large. Furthermore, v. Neumann's method of proof provides a means of computing a lower limit for the interval of time necessary to make the dispersion less than a given $\epsilon > 0$.

As subsequent developments along these lines, we mention the simplified treatment given by E. Hopf,¹⁵ who makes no use of the spectral resolution $E(\lambda)$, only employing the simpler properties of the unitary operator $U_t$.

IV. The Ergodic Theorem. —On October 22, 1931, Mr. v. Neumann communicated personally to the authors of this note his results on the Mean Ergodic Theorem. As Mr. v. Neumann pointed out then, this positive theorem raised at once the important question as to whether or not ordinary time means exist along the individual path-curves (point convergence of $Z_{\alpha\beta}(P; M)$) excepting for a possible set of Lebesgue measure zero. Smekal¹⁶ had inclined to the opinion that this is probably not the case.

Shortly thereafter, G. D. Birkhoff¹⁷ obtained by entirely new methods the following result, which we will call the Ergodic Theorem:

Under the same conditions as before, $Z_{\alpha\beta}(P; M) \to Z(P; M)$ as $\beta - \alpha \to +\infty$, for all $P$ of $\Omega$ except for a set of measure zero.

With regard to the scope of this theorem, we may make the following remarks:

1. From the point of view of the gross statistics on $\Omega$ (classical kinetic theory), it is equivalent in its implications to the Mean Ergodic Theorem.

2. From the viewpoint of the detailed statistics along an individual path-curve, it is fundamentally more far-reaching: in it is proved for the first time that the relative time of sojourn along almost every individual path-curve exists, a result often assumed implicitly in the writing of physicists, but never proved.

3. Whereas the Mean Ergodic Theorem belongs in its very nature to the theory of unitary $U_t$-operators in $L_2$, and had always been proved by this technique, the Ergodic Theorem steps outside this domain, and was proved by Birkhoff with the sole use of his fundamental lemma, in which the barest notion of the $U_t$-operator is employed:

Lemma. —Let $T$ be a measure-preserving automorphism of the metrical space $\Omega$, and let $f(P)$ be any measurable function defined on $\Omega$. Then, as $n \to \pm\infty$,

$$\frac{f(P) + f(TP) + \cdots + f(T^{n-1}P)}{n}$$

converges almost everywhere on $\Omega$ to a limit $f_1(P)$.
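The averages appearing in the lemma are easy to compute for a simple example. The Python sketch below takes for $T$ the rotation of the unit circle through an irrational fraction of a turn, which preserves length, and for $f$ the indicator function of an arc, so that the average is the relative sojourn of the orbit in that arc. The rotation number, the arc, the starting points and the function names are all assumptions chosen for this illustration.

```python
import math

ALPHA = (math.sqrt(5.0) - 1.0) / 2.0  # assumed irrational rotation number


def rotation(x):
    """Measure-preserving map of the circle [0, 1): x -> x + ALPHA (mod 1)."""
    return (x + ALPHA) % 1.0


def indicator(x):
    """Indicator function of the arc M = [0, 0.25), whose measure is 0.25."""
    return 1.0 if x < 0.25 else 0.0


def birkhoff_average(f, transform, p, n):
    """(f(P) + f(TP) + ... + f(T^(n-1) P)) / n along the orbit of p."""
    total = 0.0
    for _ in range(n):
        total += f(p)
        p = transform(p)
    return total / n


averages = [birkhoff_average(indicator, rotation, x0, 20000) for x0 in (0.0, 0.3, 0.77)]
print(averages)
```

For this rotation the averages come out close to the measure of the arc whatever the starting point, which is the behaviour described for metrically transitive systems; a rotation through a rational fraction of a turn would give averages that depend on the starting point.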

It may be stated in conclusion that the outstanding unsolved problem in the ergodic theory is the question of the truth or falsity of metrical transitivity for general Hamiltonian systems. In other words, the Quasi-Ergodic Hypothesis has been replaced by its modern version: the Hypothesis of Metrical Transitivity.

¹ Collected Works, Vol. 4, 484-512. This was published first in 1891.

² Revue Générale des Sciences, 516 (1894), et seq. The impossibility of the Ergodic Hypothesis was proved by Plancherel and Rosenthal, Ann. Phys., (4) 42, 796, 1061 (1913). The Ergodic Hypothesis is called by Maxwell and his English-speaking followers the "Principle of the Continuity of Path."

³ Encykl. d. Math. Wissenschaften, 4, Art. 32; Anm. 89a and 90 (1911). The possibility of exceptional path curves is not taken account of here.

⁴ Ibid., Anm. 93.

⁵ Ibid., 4, Art. 32; Nr. 10a (1911), and 5, Art. 28 (1923).

⁶ Les Méthodes Nouvelles de la Mécanique Céleste, Paris, Gauthier-Villars (1899), Vol. 3, Chap. XXVI; Stabilité à la Poisson.

⁷ Berlin. Ber., 580 (1919).

⁸ Acta Math., 43, 113 (1922).

⁹ Encykl. d. Math. Wissenschaften, 5, Art. 28, p. 869, Anm. 8 (1926).

¹⁰ "Structure Analysis of Surface Transformations," J. Math., 7, 365 (1928). In certain earlier papers this is referred to under the name of "strong transitivity" (cf. infra, 17. 19).

¹¹ "Hamiltonian Systems and Hilbert Space," these Proceedings, 17, 315-318 (1931).

¹² The system is said to be integrable in $L_2$ if and only if a function $f(P)$ on $\Omega$ exists which is not almost everywhere constant and such that, for all $t$ and almost all $P$ of $\Omega$, $f(P_t) = f(P)$. When this is the case, the subsets of $\Omega$ for which $f(P) > \lambda$ and for which $f(P) \le \lambda$ will each be, for some $\lambda$, subsets of $\Omega$ of positive measure which remain invariant under $(P \to P_t)$ for all $t$ except for sets of measure zero (dependent on $t$). Furthermore, since $f(P)$ is in $L_2$, the measure of the set where $|f(P)| > \epsilon > 0$ (etc.), is finite. Thus, when $\Omega$ is of infinite measure, non-integrability in $L_2$ and metrical transitivity are not co-extensive, since the latter could occur by the decomposition of $\Omega$ into two invariant subsets each of infinite measure.

¹³ "Proof of the Quasi-Ergodic Hypothesis," these Proceedings, 18, 70-82 (1932).

¹⁴ "Physical Applications of the Quasi-Ergodic Hypothesis," these Proceedings, (March, 1932).

¹⁵ "On the Time Average Theorem in Dynamics," these Proceedings, 18, 93-100 (1932).

¹⁶ Handbuch der Physik (H. Geiger and K. Scheel), Vol. 9, 1926, in particular p. 182. It may be observed here that all requisite conditions (existence of time means, etc.), have been known to be fulfilled by conditionally periodic and other similar systems. In this connection see A. Wintner, these Proceedings, 18, 248-251 (March, 1932).

¹⁷ "Proof of a Recurrence Theorem for Strongly Transitive Systems, and Proof of the Ergodic Theorem," these Proceedings, 17, 650-660 (1931).



Reprinted from Annali della R. Scuola Normale Superiore di Pisa, 
1935, s. 2, Vol. 4, pp. 267-306. 


SUR LE PROBLÈME RESTREINT DES TROIS CORPS (1)

(PREMIER MÉMOIRE)

by George D. Birkhoff (Cambridge, Mass.).


Introduction. 

The restricted problem of three bodies is of the very first importance. For the mathematician it presents itself as the most characteristic and the simplest example of a non-integrable dynamical system, while for the astronomer it constitutes the most advantageous basis for following effectively the motion of the Moon.

It was the American astronomer G. W. Hill who first showed not only its importance for the calculation of ephemerides but also its very special interest from the mathematical point of view. Shortly afterwards, the great mathematician Henri Poincaré began to develop very extensively the mathematical paths barely opened by Hill. In his magnificent work Les méthodes nouvelles de la Mécanique Céleste, published in 1891-1899, Poincaré addressed himself especially to this problem; he never ceased to occupy himself with it, as is witnessed by his last article Sur un théorème de Géométrie, published in the « Rendiconti del Circolo Matematico di Palermo » in 1912.

Among the more recent mathematical works in the same domain, one must mention before all others those of Levi-Civita, which have thrown much valuable light on the difficult questions of regularization and of stability (2).

In 1915 there appeared in the Rendiconti of Palermo a Memoir in which I studied this problem (3). Many of my later dynamical researches were devoted to the study of more general dynamical systems with two degrees of freedom. The latest of these, entitled Nouvelles recherches sur les systèmes dynamiques, has just appeared in the « Memorie della Pontificia Accademia delle Scienze Nuovi Lincei ».

(1) Two lectures given at the R. Scuola Normale Superiore di Pisa on 18 and 19 June 1934.

(2) See his Memoirs (I): Sopra alcuni criteri di instabilità, Annali di Matem., ser. III, t. 5, 1901, and (II): Traiettorie singolari ed urti nel problema ristretto dei tre corpi, Annali di Matem., ser. III, t. 9, 1903.

(3) See I of the following list:

(I): The Restricted Problem of Three Bodies, Rendiconti del Circolo Matematico di Palermo, t. 39 (1916); (II): Dynamical Systems with Two Degrees of Freedom, Transactions of the American Mathematical Society, t. 18 (1917); (III): Surface Transformations and Their Dynamical Applications, Acta Mathematica, t. 43 (1920); (IV): Nouvelles recherches

My aim here is to take up once more the study of the restricted problem of three bodies, making free use of the results of my memoir of 1915 as well as of all my general results, particularly those of my Pontifical memoir. Nevertheless I shall try as far as possible to explain the results that I use.

The first part is devoted to the examination of the analytic properties of the surface of section $S_2$ and of the corresponding fundamental transformation $T$ which I employed in 1915. It was Poincaré who first carried out the effective reduction of the restricted problem of three bodies to the problem of a transformation of a surface of section into itself, but without developing it; this type of reduction is of altogether capital theoretical importance. The particular reduction which I use is simpler and more intuitive in nature than that of Poincaré. I had pointed out that the corresponding transformation is the product of two involutory transformations; this fact plays an important role here.

The methods of the second part are qualitative rather than analytic. By making use of analytic facts and of my earlier general results I obtain much new information concerning the types of motion which exist and their relations to one another.

In the third part I consider some general questions of great interest for theoretical dynamics: the role of the Calculus of Variations in this field, in particular the applicability of the important results of Tonelli and of Morse, the recent ergodic results, and the principle of natural termination of periodic branches enunciated by E. Strömgren and studied by Wintner.

Perhaps the final goal of theoretical dynamics is to carry out the «logical integration». Indeed, the characteristic development of a mathematical theory seems to be complete when we can pass freely from the purely quantitative form to the purely qualitative form and conversely. In doing so we obtain its whole logical structure. In the case of dynamical systems, we start from the quantitative form (the given system of differential equations), and we seek to determine the qualitative properties of the motions and their relations (the invariants of the topological group).

To what extent can this final goal be regarded as realized for dynamical systems with two degrees of freedom, for example that of the restricted problem of three bodies?


sur les systèmes dynamiques, Memorie della Pontificia Accademia delle Scienze Nuovi Lincei, ser. III, t. 1 (1934); (V): Dynamical Systems, New York (1927).




If I am not mistaken, this goal is now almost attained. Indeed, one can operate with the qualitative symbolism of the «signature» of such a system, which I introduced in my Pontifical memoir, and one can thus follow all the motions indefinitely. But the most interesting point is that the symbol employed is two-dimensional, whereas ordinary mathematical symbols are only one-dimensional (4). It is by admitting such two-dimensional symbols as satisfying to the mathematical mind that one may regard the «logical integration» as achieved for dynamical problems such as the restricted problem of three bodies.


I.

Analytic part.


1. - The equations of motion.

Let $S$ and $J$ be two points of finite mass which attract each other according to Newton's law. Suppose that they move in circular orbits about their center of gravity $O$. Then, with suitable units, their masses may be represented by $\mu$ and $1-\mu$ respectively, and at the same time their distance and their angular velocity about $O$ may be taken equal to 1.

Now let $P$ be an infinitesimal body which moves in the plane of motion of the two bodies $S$ and $J$ according to this law; and let $x$, $y$ be the relative rectangular coordinates of $P$, the axes of $x$ and of $y$ being chosen in the manner indicated in fig. 1. The equations of motion are then written

(1) $\dfrac{d^2x}{dt^2} - 2\dfrac{dy}{dt} = \Omega_x, \qquad \dfrac{d^2y}{dt^2} + 2\dfrac{dx}{dt} = \Omega_y,$

where $t$ denotes the time and where

(2) $\Omega = \tfrac{1}{2}(x^2+y^2) + \dfrac{1-\mu}{r_1} + \dfrac{\mu}{r_2}$

$\left(r_1 = \sqrt{(x-\mu)^2+y^2}, \quad r_2 = \sqrt{(x-\mu+1)^2+y^2}\right).$

(4) This is not always the case; for example, Riemann surfaces must be regarded as a kind of two-dimensional symbol.


Here we have put

$\partial\Omega/\partial x = \Omega_x, \qquad \partial\Omega/\partial y = \Omega_y.$

These equations (1) admit the evident integral of Jacobi

(3) $\left(\dfrac{dx}{dt}\right)^2 + \left(\dfrac{dy}{dt}\right)^2 = 2\Omega - C.$

In general we suppose in what follows that the constant of integration $C$ is given in advance. This amounts to saying that we use the integral (3) to lower the order of (1) effectively from the fourth to the third. In particular we can carry out such a reduction by introducing the variable

(4) $\varphi = \arctan\dfrac{dy}{dx},$

which gives the following reduced system

(5) $\dfrac{dx}{dt} = \sqrt{2\Omega - C}\,\cos\varphi \equiv X(x, y, \varphi),$

$\dfrac{dy}{dt} = \sqrt{2\Omega - C}\,\sin\varphi \equiv Y(x, y, \varphi),$

$\dfrac{d\varphi}{dt} = -2 + \dfrac{-\Omega_x \sin\varphi + \Omega_y \cos\varphi}{\sqrt{2\Omega - C}} \equiv \Phi(x, y, \varphi).$


Let us note the evident fact that if we regard $x$, $y$, $\varphi$ as the coordinates of a «state of motion», the corresponding real trajectories will fill the region

$2\Omega - C > 0, \quad r_1 \neq 0, \quad r_2 \neq 0$

of a three-dimensional space with rectangular coordinates $x$, $y$, $\varphi$. Unfortunately the equations (5) become singular on the three boundaries of this region, since the functions $X$, $Y$ and $\Phi$ become singular there. We shall see in the following sections how one can obtain a geometric representation which is everywhere free from singularity.
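The role of the Jacobi integral (3) can be illustrated numerically. The following is a minimal sketch, not the author's procedure: it assumes the standard rotating-frame form of equations (1) and of the force function (2), with $J$ of mass $1-\mu$ at $(\mu, 0)$ and $S$ of mass $\mu$ at $(\mu-1, 0)$, and it uses a plain Runge-Kutta step. The value `MU = 0.01`, the starting point, the step size and all function names are illustrative assumptions.

```python
import math

MU = 0.01  # assumed illustrative mass parameter (not taken from the text)


def omega(x, y, mu=MU):
    """Force function (2): J of mass 1 - mu at (mu, 0), S of mass mu at (mu - 1, 0)."""
    r1 = math.hypot(x - mu, y)
    r2 = math.hypot(x - mu + 1.0, y)
    return 0.5 * (x * x + y * y) + (1.0 - mu) / r1 + mu / r2


def grad_omega(x, y, mu=MU):
    """Partial derivatives Omega_x and Omega_y."""
    r1 = math.hypot(x - mu, y)
    r2 = math.hypot(x - mu + 1.0, y)
    ox = x - (1.0 - mu) * (x - mu) / r1**3 - mu * (x - mu + 1.0) / r2**3
    oy = y - (1.0 - mu) * y / r1**3 - mu * y / r2**3
    return ox, oy


def rhs(state):
    """Equations (1) written as a first-order system in (x, y, x', y')."""
    x, y, vx, vy = state
    ox, oy = grad_omega(x, y)
    return (vx, vy, 2.0 * vy + ox, -2.0 * vx + oy)


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shifted(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = rhs(state)
    k2 = rhs(shifted(state, k1, 0.5 * dt))
    k3 = rhs(shifted(state, k2, 0.5 * dt))
    k4 = rhs(shifted(state, k3, dt))
    return tuple(
        s + dt * (a + 2.0 * b + 2.0 * c + d) / 6.0
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def jacobi_constant(state):
    """C recovered from the integral (3)."""
    x, y, vx, vy = state
    return 2.0 * omega(x, y) - (vx * vx + vy * vy)


def integrate(state, dt=1.0e-3, steps=2000):
    for _ in range(steps):
        state = rk4_step(state, dt)
    return state


# Start on the x-axis near J with C = 3.5, moving in the retrograde sense.
C = 3.5
x0, y0 = MU + 0.2, 0.0
speed = math.sqrt(2.0 * omega(x0, y0) - C)
start = (x0, y0, 0.0, -speed)
end = integrate(start)
drift = abs(jacobi_constant(end) - C)
```

Under these assumptions `drift` stays very small over the integration, which is the numerical counterpart of the statement that $C$ may be fixed in advance and used to lower the order of the system.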

As the time $t$ increases, each point of this region will follow the corresponding trajectory. Hence we can associate the totality of the trajectories with the steady motion of a fluid defined by the equations (5). Moreover, this motion is that of an incompressible fluid, since we have the identity

(6) $X_x + Y_y + \Phi_\varphi = 0.$

For every constant $C > 3$ and for $\mu$ sufficiently small, the «curve of zero velocity» $2\Omega - C = 0$ will have a closed analytic branch about $J$ which does not enclose the point $S$ and which is symmetric with respect to the $x$-axis. When $\mu$ tends to zero, this branch will tend analytically to the circle $r = \sqrt{x^2+y^2} = \bar r$, where $\bar r$ denotes the root $< 1$ of the cubic equation

$\bar r^{\,2} + \dfrac{2}{\bar r} - C = 0$ (5).

The limiting case $\mu = 0$ (or $\mu = 1$) is integrable, because the problem then reduces to that of two bodies, one ($J$) of mass one situated at the origin, and the other ($P$) of infinitesimal mass situated at the point $(x, y)$. It must not be forgotten that the plane of the axes $x$, $y$ rotates in the positive sense with angular velocity 1 about the origin with respect to the fixed axes.

We shall study exclusively the case where $C > \sqrt[3]{32} > 3$ and $\mu$ is small, in which the system does not differ much from the integrable case $\mu = 0$.

2. - Non-singular representation of the motions in $S_3$.

To obtain a representation of the states of motion free from singularity, let us return to the system (1), (3) and make the transformation of Thiele and Levi-Civita (6)

(7) $x - \mu = p^2 - q^2, \qquad y = 2pq, \qquad dt = 4(p^2+q^2)\,d\tau.$

This transformation can be written

$z = f(w) = \mu + w^2, \qquad dt = |f'(w)|^2\,d\tau,$

where $z = x + \sqrt{-1}\,y$, $w = p + \sqrt{-1}\,q$. One then finds, either by direct calculation or by the use of a general theory, that the analogous transformed equations are

(1') $\dfrac{d^2p}{d\tau^2} - 8(p^2+q^2)\dfrac{dq}{d\tau} = \Omega^*_p, \qquad \dfrac{d^2q}{d\tau^2} + 8(p^2+q^2)\dfrac{dp}{d\tau} = \Omega^*_q,$

(3') $\left(\dfrac{dp}{d\tau}\right)^2 + \left(\dfrac{dq}{d\tau}\right)^2 = 2\Omega^*,$

where

(8) $\Omega^*(p, q, \mu) = 2\bigl(2\Omega(\mu + p^2 - q^2,\ 2pq,\ \mu) - C\bigr)(p^2+q^2).$

Note that $\Omega^*$, as a function of $p$ and $q$, remains finite and analytic even for $p = q = 0$.
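The finiteness of $\Omega^*$ at $p = q = 0$ can be seen by cancelling the pole $1/r_1$ against the factor $p^2+q^2 = r_1$. The sketch below assumes the form of $\Omega$ used in (2) with $J$ at $(\mu, 0)$; the function names and the numerical values are illustrative only.

```python
import math


def omega(x, y, mu):
    """Force function (2), assumed form with J at (mu, 0) and S at (mu - 1, 0)."""
    r1 = math.hypot(x - mu, y)
    r2 = math.hypot(x - mu + 1.0, y)
    return 0.5 * (x * x + y * y) + (1.0 - mu) / r1 + mu / r2


def omega_star_naive(p, q, mu, c):
    """Formula (8) evaluated literally; undefined at p = q = 0."""
    x, y = mu + p * p - q * q, 2.0 * p * q
    return 2.0 * (2.0 * omega(x, y, mu) - c) * (p * p + q * q)


def omega_star(p, q, mu, c):
    """Same function with the pole 1/r1 cancelled, using r1 = p^2 + q^2."""
    x, y = mu + p * p - q * q, 2.0 * p * q
    rho2 = p * p + q * q
    r2 = math.hypot(x - mu + 1.0, y)
    return 2.0 * (rho2 * (x * x + y * y - c) + 2.0 * (1.0 - mu) + 2.0 * mu * rho2 / r2)


mu, c = 0.01, 3.5
value_at_collision = omega_star(0.0, 0.0, mu, c)   # finite: 4 (1 - mu)
```

In this form the value at the collision point is $4(1-\mu)$, independent of $C$, which is the value used in the regularity argument for $S_3$.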

Let us now take the four variables $p$, $q$, $p'$, $q'$ as coordinates of a «state of motion». Here we may consider $p$, $q$, $p'$, $q'$ as the rectangular coordinates of a point of the four-dimensional space $S_4$. For the given value of the constant $C$ we shall have the equation (3') between these coordinates, hence

(3'') $F(p, q, p', q', \mu, C) \equiv p'^2 + q'^2 - 2\Omega^*(p, q, \mu, C) = 0,$

which defines a corresponding finite space $S_3$. I say that this space is everywhere regular and analytic. Indeed the function $F$ is analytic everywhere for $(p, q)$ in the region $\Omega^* \geq 0$ about the point $J$, and for arbitrary $p'$, $q'$. Consequently we have only to show that the four partial derivatives $F_p$, $F_q$, $F_{p'}$, $F_{q'}$ cannot vanish at the same time, that is to say that the four equations

$\Omega^*_p = 0, \quad \Omega^*_q = 0, \quad p' = 0, \quad q' = 0$

are not compatible. But, in the contrary case, using (3') we should have $\Omega^* = 0$. This equation does not hold at the origin $p = q = 0$ of the $p$, $q$ plane, since $\Omega^*$ reduces to $4(1-\mu)$ at this point. We conclude that the equations

$2\Omega(x, y, \mu) - C = 0, \qquad \Omega_x = \Omega_y = 0$

hold at the corresponding point $(x, y) \neq (\mu, 0)$, that is to say that the curve of zero velocity about $J$ would have a double point at $(x, y)$, which is not true.

Hence, for $C > 3$ and $\mu$ small, the finite space $S_3$ defined by the equation (3'') is everywhere regular and analytic, and will vary analytically with $\mu$ and $C$.


(5) See I, sections 6-8.

(6) T. N. Thiele: Recherches numériques concernant les solutions périodiques d'un cas spécial du problème des trois corps, Astr. Nachr., t. 138 (1895). This transformation was discovered independently by Levi-Civita, who showed its fundamental theoretical importance ((II), loc. cit.).


3. - Return to the variables $x$, $y$, $x'$, $y'$.

Let us now remark that to each state of motion in the variables $x$, $y$, $x'$, $y'$ there corresponds without exception a pair of two distinct points $(p, q, p', q')$ of $S_3$. To show this one must consider separately the non-singular case in which $(x, y)$ does not coincide with $J$, and the singular case of collision where $(x, y) = (\mu, 0)$. In the first case the equations (7) give us precisely two corresponding points $(p_1, q_1)$ and $(p_2, q_2)$ with $p_2 = -p_1$, $q_2 = -q_1$ and $(p_1, q_1) \neq (p_2, q_2)$. Differentiating these equations (7) we obtain immediately the corresponding $p_1'$, $q_1'$, $p_2'$, $q_2'$ with $p_2' = -p_1'$, $q_2' = -q_1'$. Hence the stated result is true for $(x, y) \neq (\mu, 0)$.

In the excluded case one must first of all define what the states of motion in the variables $x$, $y$, $x'$, $y'$ are. To do this, observe that in the $p$, $q$ plane and with the modified time $\tau$, the curves of motion are entirely regular and analytic in the neighborhood of $p = q = 0$, and that the velocity there is approximately $\sqrt{8(1-\mu)} \neq 0$. This shows us that the corresponding trajectories of the $x$, $y$ plane in the neighborhood of $J$ are approximately of parabolic form, or even possess a cusp at $J$ (see fig. 2). Hence the singular states which correspond to a state of collision of $P$ with $J$ are uniquely determined by the direction of the tangent at this point. It follows that there are two points $(p, q, p', q')$ corresponding to such a state, namely $\left(0,\ 0,\ \pm\sqrt{8(1-\mu)}\cos\tfrac{1}{2}\varphi,\ \pm\sqrt{8(1-\mu)}\sin\tfrac{1}{2}\varphi\right)$, where $\varphi$ denotes the angle which the tangent at the cusp makes with the $x$-axis in the $x$, $y$ plane (see the same figure).

Consequently there is always a pair of two distinct points of $S_3$, $(\pm p, \pm q, \pm p', \pm q')$, corresponding to an arbitrary state of motion of (1), (3).
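The two-to-one correspondence can be checked directly from (7). The sketch below, with hypothetical function and variable names, maps a regularized state $(p, q, p', q')$ away from the collision point to $(x, y, dx/dt, dy/dt)$ and confirms that the two points $\pm(p, q, p', q')$ give the same state.

```python
def state_from_regularized(p, q, dp, dq, mu):
    """Map (p, q, p', q') to (x, y, dx/dt, dy/dt) using (7); requires (p, q) != (0, 0)."""
    rho2 = p * p + q * q
    x = mu + p * p - q * q
    y = 2.0 * p * q
    # Differentiate (7) with respect to tau, then use dt = 4 (p^2 + q^2) dtau.
    vx = (2.0 * p * dp - 2.0 * q * dq) / (4.0 * rho2)
    vy = (2.0 * p * dq + 2.0 * q * dp) / (4.0 * rho2)
    return x, y, vx, vy


mu = 0.01
point = (0.3, -0.2, 1.1, 0.7)                 # arbitrary illustrative values
mirror = tuple(-v for v in point)
state_a = state_from_regularized(*point, mu)
state_b = state_from_regularized(*mirror, mu)
```

The same map gives $x'^2 + y'^2 = (p'^2 + q'^2)/\bigl(4(p^2+q^2)\bigr)$, which is how the integral (3) goes over into (3').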

4. - The topological nature of $S_3$.

An examination of the equation (3'') of $S_3$ shows us without difficulty (see my memoir I, section 8) that this correspondence between the pairs of equivalent points of $S_3$ and the individual states of motion is such that the fundamental region of distinct states is topologically equivalent to a sphere of which two diametrically opposite points of the surface are regarded as identical. More exactly:

The individual points of $S_3$ are in one-to-one and continuous correspondence with the points of ordinary three-dimensional space completed by an ideal point at infinity, in which two points $(\xi, \eta, \zeta)$ and $(\xi', \eta', \zeta')$ constitute one of the pairs of equivalent points provided that

$\xi' = -\dfrac{\varrho^2\,\xi}{\xi^2+\eta^2+\zeta^2}, \qquad \eta' = -\dfrac{\varrho^2\,\eta}{\xi^2+\eta^2+\zeta^2}, \qquad \zeta' = -\dfrac{\varrho^2\,\zeta}{\xi^2+\eta^2+\zeta^2}.$

In this case $\varrho$ denotes the radius of the aforesaid sphere.


5. - The corresponding steady motion.

At each point of the representative space $S_3$ the equations (1'), (3') define a direction determined by $(dp, dq, dp', dq')$, and this direction varies analytically with the position of the point of $S_3$. It is to be noted that $dp$, $dq$, $dp'$, $dq'$ cannot vanish at the same time. The trajectories of the motion in $S_3$ are the regular analytic curves which everywhere follow this direction.

Introducing the coordinates $p$, $q$ and

(9) $\psi = \arctan\dfrac{dq}{dp},$

analogous to $x$, $y$ and $\varphi$, one sees immediately that the integral $\iiint dp\,dq\,d\psi$ will remain invariant; it must be observed that this volume integral taken throughout $S_3$ has a finite value, namely $2\pi A$, where $A$ denotes the area in the interior of the branch of $\Omega^* = 0$ which surrounds $J$. This invariant integral can be written in the form $\iiint m(Q)\,dQ$, where $dQ$ denotes the three-dimensional volume element in $S_3$ and $m(Q)$ denotes a function of the position $Q$ in $S_3$.

Let us calculate this function $m(Q)$. For this, consider the infinitesimal layer between the surfaces $F = 0$ and $F = dF$ of the Euclidean space of $p$, $q$, $p'$, $q'$. The four-dimensional volume $dV$ of a cylindrical element whose base in $S_3$ (that is, $F = 0$) has three-dimensional volume $dv$, and which is situated between these two surfaces, will be given by the evident formula

$dV = dp\,dq\,dp'\,dq' = \dfrac{dF}{\sqrt{F_p^2 + F_q^2 + F_{p'}^2 + F_{q'}^2}}\,dv,$

since the first factor in the third member represents the distance between the two surfaces.

But the equations (1') define the steady motion of a fluid in the four-dimensional space, with incompressible volume and such that each surface $F = D$ ($D$ being an arbitrary constant) remains invariant; to see this it suffices to write them in the form

$\dfrac{dp}{d\tau} = p', \quad \dfrac{dq}{d\tau} = q', \quad \dfrac{dp'}{d\tau} = 8(p^2+q^2)\,q' + \Omega^*_p, \quad \dfrac{dq'}{d\tau} = -8(p^2+q^2)\,p' + \Omega^*_q.$

Considering only the part of this space between $F = 0$ and $F = dF$, one deduces that the expression

$dF \iiint \dfrac{dv}{\sqrt{F_p^2 + F_q^2 + F_{p'}^2 + F_{q'}^2}}$

represents the invariant volume.

On the other hand, let us introduce the four variables $p$, $q$, $F$ and $\psi$ in place of $p$, $q$, $p'$, $q'$, where

(9') $\psi = \arctan\dfrac{q'}{p'}.$

One then finds the equations

$p' = \sqrt{F + 2\Omega^*}\,\cos\psi, \qquad q' = \sqrt{F + 2\Omega^*}\,\sin\psi,$

whence immediately

$\dfrac{\partial(p, q, p', q')}{\partial(p, q, F, \psi)} = \dfrac{\partial(p', q')}{\partial(F, \psi)} = \dfrac{1}{2}.$

Hence the same invariant volume can be written

$\tfrac{1}{2}\,dF \iiint dp\,dq\,d\psi.$

Consequently one finds, on comparing, that

(10) $m(Q) = \dfrac{1}{\sqrt{2\Omega^* + \Omega_p^{*2} + \Omega_q^{*2}}};$

hence the function $m(Q)$ is analytic and positive in $S_3$, as we had said.




Consequently the equations (1'), (3') define a steady motion of a fluid in $S_3$ without a point of equilibrium, whose velocity components $dp/d\tau$, $dq/d\tau$, $dp'/d\tau$, $dq'/d\tau$ are analytic. This motion leaves invariant a volume integral of the form $\iiint m(Q)\,dQ$, where $m(Q)$ is given explicitly by (10) and where $dQ$ denotes the ordinary volume. The function $m(Q)$ is analytic and positive everywhere in $S_3$.

It is to be noted that the fluid motion defined by (1'), (3') is not the same as that defined by (1), (3), although the trajectories correspond to one another. Indeed, the time $\tau$ which enters in (1'), (3') is altogether different from the time $t$ which enters in (1), (3).

6. - The surface of section $S_2$ in the integrable case $\mu = 0$.

We shall consider in the first place the very important limiting case $\mu = 0$, referring the reader to memoir I (sections 9, 10) for some simple details, without giving the proofs here.

In the $x$, $y$ plane the point $P$ can then move in any ellipse rotating about the focus $O$ with angular velocity $-1$, provided that the semi-axes $a$ and $b$ of this ellipse ($a \geq |b|$) satisfy the equation

(11) $\dfrac{C}{2} = \dfrac{b}{\sqrt{a}} + \dfrac{1}{2a},$

where $b$ must be regarded as positive or negative according as the motion in the ellipse is direct or retrograde. Relative to the axes of this ellipse the point $P$ moves according to the well-known laws of Kepler, the attracting point $J$ being of mass 1 and situated at the focus $O$.

There then exist for every $C > 3$ two periodic motions whose corresponding curves in the $x$, $y$ plane are circles of radii $a_1$ and $a_2$ ($a_1 < a_2$) about $O$, both in the interior of the circle of zero velocity about $O$. These quantities $a_1$, $a_2$ are real roots of the equation

(12) $\pm 2a^{3/2} + 1 - Ca = 0$

derived from (11). To deduce this relation it suffices to put $b = \pm a$ in (11). The circle of radius $a_1$ corresponds to a retrograde motion about $J$, the other of radius $a_2$ ($> a_1$) corresponds to a direct motion. In the space $S_3$ the corresponding trajectories $L_1$ and $L_2$ are represented by two closed trajectories.

Let us now consider the surface $\Sigma_2$ in $S_3$ defined by the following equation

(13) $G \equiv pp' + qq' = 0.$

This represents all the states of motion of $S_3$ for which the direction of projection is tangent, in either sense, to a circle of the $p$, $q$ plane with center at the point $O$. Hence the curves $L_1$ and $L_2$ are curves situated in $\Sigma_2$.




For $\mu = 0$ the equation which defines $S_3$ reduces to

(14) $F \equiv p'^2 + q'^2 - 2h(p^2+q^2) = 0,$

where

$h(p^2+q^2) \equiv \Omega^* = 2(p^2+q^2)^3 + 4 - 2C(p^2+q^2).$

The surface will be analytic everywhere if the matrix

$\begin{Vmatrix} F_p & F_q & F_{p'} & F_{q'} \\ G_p & G_q & G_{p'} & G_{q'} \end{Vmatrix},$

that is,

$\begin{Vmatrix} -4ph' & -4qh' & 2p' & 2q' \\ p' & q' & p & q \end{Vmatrix},$

is of rank two everywhere.

But in the contrary case the rank will be at most 1 at a certain point $(p, q, p', q')$ of $S_3$, which requires in particular

$qp' - pq' = 0.$

Using (13) and this last equation, we conclude that $p = q = 0$ or $p' = q' = 0$. In the first case the matrix reduces to

$\begin{Vmatrix} 0 & 0 & 2p' & 2q' \\ p' & q' & 0 & 0 \end{Vmatrix};$

in the second case it reduces to

$\begin{Vmatrix} -4ph' & -4qh' & 0 & 0 \\ 0 & 0 & p & q \end{Vmatrix}.$

It follows therefore that we have either $p = q = p' = q' = 0$ or $h' = p' = q' = 0$. But the first possibility does not occur, as (14) shows. The second possibility must also be excluded; indeed the equality $h' = 0$ reduces to $C = 3(p^2+q^2)^2$, and together with $h = 0$, which follows from (14) when $p' = q' = 0$, this gives $p^2+q^2 = 1$ and $C = 3$, which would contradict our hypothesis $C > 3$.

Hence the closed surface $\Sigma_2$ is analytic everywhere.

It is almost evident that $\Sigma_2$ has the topological nature of a torus. Indeed consider the points of $\Sigma_2$ corresponding to the states of motion for which $p^2 + q^2 = \varrho^2$, where $\varrho$ has a given value. We consider $\varrho$ as positive or negative according as the tangent is direct or retrograde. Thus we obtain a family of closed curves $C_\varrho$ which covers $\Sigma_2$ completely. For $\varrho = 0$ and also for $\varrho$ numerically equal to the radius of the circle of zero velocity, there is only a single curve $C_\varrho$. Hence $\Sigma_2$ is a torus.

Moreover, $\Sigma_2$ will be divided into two annular parts by $L_1$ and $L_2$, which are found among the curves $C_\varrho$, namely $C_{-\sqrt{a_1}}$ and $C_{\sqrt{a_2}}$ respectively. Let us denote by $S_2$ the annular part for which $-\sqrt{a_1} \leq \varrho \leq \sqrt{a_2}$ and by $S_2'$ the complementary part $\varrho \leq -\sqrt{a_1}$, $\varrho \geq \sqrt{a_2}$. The part $S_2$ will play an important role in what follows.


Let us now introduce $\varrho$ and $\theta = 2\tan^{-1} q/p = -2\tan^{-1} p'/q'$ as surface parameters in $\Sigma_2$. This surface is expressed as a function of these parameters as follows:

(15) $p = \varrho\cos\tfrac{1}{2}\theta, \quad q = \varrho\sin\tfrac{1}{2}\theta, \quad p' = -\sqrt{2h(\varrho^2)}\,\sin\tfrac{1}{2}\theta, \quad q' = \sqrt{2h(\varrho^2)}\,\cos\tfrac{1}{2}\theta.$

The angular variable $\theta/2$ denotes the angle formed by the positive $q$-axis and the direction of projection. Hence $\varrho$ and $\theta$ are normal analytic parameters except along the circle of zero velocity, where $h(\varrho^2)$ vanishes while remaining analytic.

The annular surface $S_2$ in $S_3$ defined by (15) for $-\sqrt{a_1} \leq \varrho \leq \sqrt{a_2}$ is therefore analytic everywhere, and admits $\varrho = \pm\sqrt[4]{x^2+y^2} = \pm\sqrt{p^2+q^2}$ and $\theta = \tan^{-1} y/x = 2\tan^{-1} q/p$ as normal surface parameters, even along the boundaries $\varrho = -\sqrt{a_1}$, $\varrho = \sqrt{a_2}$.

A suitable analytic extension of $S_2$ for $\mu \neq 0$ will play the role of a «surface of section» in what follows.

Let us remark, in conclusion, that the two points $(\varrho, \theta)$ and $(\varrho, \theta + 2\pi)$ are corresponding pairs of $S_2$ which represent the same state of motion in the variables $x$, $y$, $x'$, $y'$.

We shall in general consider these two points as identical, hence $\theta$ as an angular variable of period $2\pi$.

7. - The relative period in $\tau$ for $\mu = 0$.

It is evident that in the regularizing variables $p$, $q$, $\tau$ every motion will have the following general character. The radius $\varrho$ will increase steadily from its least value $\varrho'$ to its greatest value $\varrho''$, while the corresponding polar angle $\sigma = \tan^{-1} q/p$ increases by a certain angle $\alpha$; then the radius will decrease in a symmetric manner, while $\sigma$ increases again by the same angle $\alpha$. Hence one obtains a corresponding arc $PQR$ in the $p$, $q$ plane, such that $\angle POQ = \angle QOR = \alpha$ and such that the arcs $PQ$ and $QR$ are symmetric with respect to $OQ$ (see fig. 3).

One can then continue the curve of motion indefinitely by making successive reflections of the arcs already obtained about $OR$, $OP$, etc. We shall call the time $\tau^*$ corresponding to a complete arc, such as $PQR$, the relative period, for fairly evident reasons.

Let us remark that the relative period in the regularizing variables $p$, $q$, $\tau$ has the value $\pi a^{1/2}/2$.

Indeed, according to the law of areas relative to the axes $x$, $y$ we have (see I, section 9)

$r^2\left(\dfrac{d\theta}{dt} + 1\right) = \pm\sqrt{a(1-e^2)},$

where $a$ denotes the semi-major axis and $e$ denotes the eccentricity of the motion. If one introduces here $\theta = 2\sigma$, $r = \varrho^2$, $dt = 4\varrho^2\,d\tau$, there results immediately

$\varrho^2\,\dfrac{d\sigma}{d\tau} = 2\left(\pm\sqrt{a(1-e^2)} - \varrho^4\right).$

On the other hand the integral (3') gives us for $\mu = 0$

$\left(\dfrac{d\varrho}{d\tau}\right)^2 + \varrho^2\left(\dfrac{d\sigma}{d\tau}\right)^2 = 2h(\varrho^2),$

which permits us to obtain $d\varrho/d\tau$ explicitly by means of the preceding equation. Thus we obtain $d\tau/d\varrho$ as a function of $\varrho$. Letting $\varrho$ increase from $\varrho'$ to $\varrho''$, there results the following equation for the relative period $\tau^*$:

$\tau^* = 2\displaystyle\int_{\varrho'}^{\varrho''} \dfrac{\varrho\,d\varrho}{\sqrt{2\varrho^2 h(\varrho^2) - 4\left(\pm\sqrt{a(1-e^2)} - \varrho^4\right)^2}} = \displaystyle\int_{r'}^{r''} \dfrac{dr}{\sqrt{2r h(r) - 4\left(\pm\sqrt{a(1-e^2)} - r^2\right)^2}}.$

I say that we have here

$P(r) \equiv 2r h(r) - 4\left(\pm\sqrt{a(1-e^2)} - r^2\right)^2 = 4(r - r')(r'' - r)/a.$

Indeed we have explicitly

$h(r) = 2r^3 - 2Cr + 4,$

which shows us that the polynomial in $r$, $P(r)$, is of the second degree only. On the other hand $dr/d\tau$ vanishes for $r = r'$ and $r = r''$, hence it must be concluded that $r'$ and $r''$ are the two roots of the quadratic equation $P(r) = 0$, which determines $P(r)$ up to a factor. Putting $r = 0$, one obtains by comparison the explicit expression which we have already written for $P(r)$, since one has $r' = a(1-e)$, $r'' = a(1+e)$.

Substituting this value of $P(r)$ and integrating, one obtains $\tau^* = \pi a^{1/2}/2$.
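The value of the relative period can be checked numerically. The sketch below evaluates $\int_{r'}^{r''} dr/\sqrt{P(r)}$ by a midpoint rule after the substitution $r = \tfrac{1}{2}(r'+r'') + \tfrac{1}{2}(r''-r')\sin\phi$, with $P(r)$ computed from its definition through $h(r) = 2r^3 - 2Cr + 4$. It assumes a direct ellipse and the relation $C = 1/a + 2\sqrt{a(1-e^2)}$ between $C$, $a$ and $e$; the function name and the sample values $a = 0.3$, $e = 0.4$ are illustrative.

```python
import math


def relative_period(a, e, n=2000):
    """Midpoint-rule value of the integral of dr / sqrt(P(r)) from r' to r''."""
    ang = math.sqrt(a * (1.0 - e * e))      # direct motion: +sqrt(a (1 - e^2))
    c = 1.0 / a + 2.0 * ang                 # assumed relation between C, a and e
    r_min, r_max = a * (1.0 - e), a * (1.0 + e)
    mid, half = 0.5 * (r_min + r_max), 0.5 * (r_max - r_min)

    def big_p(r):
        h = 2.0 * r**3 - 2.0 * c * r + 4.0
        return 2.0 * r * h - 4.0 * (ang - r * r) ** 2

    total = 0.0
    step = math.pi / n
    for k in range(n):
        phi = -0.5 * math.pi + (k + 0.5) * step   # r = mid + half * sin(phi)
        r = mid + half * math.sin(phi)
        total += half * math.cos(phi) / math.sqrt(big_p(r)) * step
    return total


tau_star = relative_period(0.3, 0.4)
expected = math.pi * math.sqrt(0.3) / 2.0
```

The substitution removes the inverse-square-root endpoint singularities, so a simple quadrature suffices; the agreement of `tau_star` with `expected` also confirms the factorization of $P(r)$ for these sample values.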


8. - The corresponding transformation $T$ for $\mu = 0$.

All the trajectories other than $L_1$ and $L_2$, which constitute the boundaries of $S_2$, traverse $S_2$ in the same sense; this amounts to saying that none of the other trajectories can be tangent to $S_2$ in $S_3$. Otherwise, at a point where the trajectory were tangent to $S_2$, the derivative $dG/d\tau$ would have to vanish. Using (1'), (3') for $\mu = 0$ with $\Omega^* = h(p^2+q^2)$, and using the parameters $\varrho$, $\theta$, this derivative takes the following form:

$\dfrac{dG}{d\tau} = 2\left(h + 4\varrho^3\sqrt{2h} + \varrho^2 h'\right).$

Writing $A = 1 - C\varrho^2$, we can write:

$\dfrac{dG}{d\tau} = 8\left(2\varrho^6 + A + 2\varrho^3\sqrt{\varrho^6 + 1 + A}\right) = \dfrac{-8(2\varrho^3 - A)(2\varrho^3 + A)}{2\varrho^6 + A - 2\varrho^3\sqrt{\varrho^6 + 1 + A}}.$

But comparison with (12) shows us that the numerator vanishes for $\varrho = -\sqrt{a_1}$ and $\varrho = \sqrt{a_2}$, hence that $dG/d\tau$ is of one sign except along $L_1$ and $L_2$, where this function vanishes; moreover, $dG/d\tau$ must be positive since it is equal to 8 for $\varrho = 0$.

The derivative $dG/d\tau$ is infinitely small only of the first order in the neighborhood of the boundaries $L_1$ and $L_2$ with respect to the infinitesimals $\varrho + \sqrt{a_1}$ and $\varrho - \sqrt{a_2}$ respectively; this conclusion results from the fact that $-\sqrt{a_1}$ and $+\sqrt{a_2}$ are simple roots of (12).
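The sign behaviour of $dG/d\tau$ on $S_2$ can be explored numerically. The sketch below is an illustration under stated assumptions: it takes the radii of the circular orbits as the roots in $(0, 1)$ of $-2a^{3/2} + 1 - Ca = 0$ (retrograde, $a_1$) and $2a^{3/2} + 1 - Ca = 0$ (direct, $a_2$), finds them by bisection, and evaluates $dG/d\tau = 8\bigl(2\varrho^6 + A + 2\varrho^3\sqrt{\varrho^6 + 1 + A}\bigr)$ with $A = 1 - C\varrho^2$. The function names and the value $C = 3.5$ are illustrative.

```python
import math


def bisect(f, lo, hi, iters=200):
    """Plain bisection; assumes f(lo) and f(hi) have opposite signs."""
    f_lo = f(lo)
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        f_mid = f(mid)
        if (f_mid > 0.0) == (f_lo > 0.0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def circular_radii(c):
    """Roots a1 (retrograde) and a2 (direct, a2 < 1) of -/+ 2 a^(3/2) + 1 - C a = 0."""
    a1 = bisect(lambda a: -2.0 * a**1.5 + 1.0 - c * a, 0.0, 1.0)
    a2 = bisect(lambda a: 2.0 * a**1.5 + 1.0 - c * a, 0.0, 1.0)
    return a1, a2


def dG_dtau(rho, c):
    """dG/dtau on the surface G = 0, as a function of the signed radius rho."""
    big_a = 1.0 - c * rho * rho
    return 8.0 * (2.0 * rho**6 + big_a + 2.0 * rho**3 * math.sqrt(rho**6 + 1.0 + big_a))


C = 3.5
a1, a2 = circular_radii(C)
left, right = -math.sqrt(a1), math.sqrt(a2)
# Sample dG/dtau strictly between the two boundary values of rho.
interior = [dG_dtau(left + k * (right - left) / 100.0, C) for k in range(1, 100)]
```

For this sample value of $C$ the derivative vanishes at the two boundary values of $\varrho$, equals 8 at $\varrho = 0$, and is positive at every sampled interior point.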

In the same manner one sees that every trajectory traverses $S_2'$ in a single sense, with $dG/d\tau < 0$ at every interior point.

So far we have not shown that every trajectory other than $L_1$ and $L_2$ must traverse $S_2$ (and $S_2'$). But we have already remarked that the function $p^2 + q^2$ is periodic, of period $\pi a^{1/2}/2$ in $\tau$, along every trajectory in this integrable case, increasing from $\varrho'^2$ to $\varrho''^2$ during half of its period and decreasing from $\varrho''^2$ to $\varrho'^2$ during the other half. Consequently its derivative, $2G$, vanishes precisely twice in a period, once with $dG/d\tau > 0$, the other time with $dG/d\tau < 0$. Hence every trajectory other than $L_1$ and $L_2$ traverses $S_2$ and $S_2'$ indefinitely at successive intervals $\pi a^{1/2}/4$.

Let $(\varrho, \theta) = P$ be an arbitrary point of $S_2$ and let $(\varrho_1, \theta_1) = P_1$ be the point of $S_2$ which follows $P$ along the same trajectory. Evidently one has $\varrho_1 = \varrho$. Moreover, the period of an elliptic motion of semi-major axis $a$ is $2\pi a^{3/2}$; hence, during this period, the major axis rotates in the negative sense through an angle $2\pi a^{3/2}$ measured from the $x$-axis. It follows that $\theta$ diminishes by $2\pi a^{3/2}$ in one period, that is to say that $\theta_1 = \theta - 2\pi a^{3/2}$.

Hence the space $S_3$ contains the analytic annular surface $S_2$ defined by (15), where the parameters $\varrho$, $\theta$ are normal parameters with $\theta$ periodic of period $2\pi$, and $-\sqrt{a_1} \leq \varrho \leq \sqrt{a_2}$. The two boundaries $\varrho = -\sqrt{a_1}$ and $\varrho = \sqrt{a_2}$ of $S_2$ are the retrograde and direct circular trajectories respectively. All the other trajectories traverse $S_2$ in each interval $\pi a^{1/2}/2$ of time $\tau$ and in the same sense. Moreover, the angle formed by $S_2$ and such a trajectory is never zero, and remains even of the first order with respect to $\varrho + \sqrt{a_1}$ and $\varrho - \sqrt{a_2}$ in the neighborhood of the boundaries. If $Q_1 = (\varrho_1, \theta_1)$ is the point which follows $Q = (\varrho, \theta)$ along a trajectory, one thus defines a transformation

(16) $T: \quad \varrho_1 = \varrho, \qquad \theta_1 = \theta - 2\pi a^{3/2}(\varrho)$

of $S_2$ into itself which is everywhere one-to-one and analytic, even along the boundaries. The function $a(\varrho)$ which enters here has the following analytic expression:

$a(\varrho) = \dfrac{1}{C - 2\varrho^4 - 2\varrho\sqrt{\varrho^6 - C\varrho^2 + 2}}.$

This transformation and its suitable analytic extension for $\mu \neq 0$ play a fundamental role in what follows.
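The transformation $T$ of the integrable case is a twist map of the annulus and is simple to write down. The sketch below is an illustration only. It assumes the sign convention in which $\varrho > 0$ corresponds to a direct tangent, for which the relation $1/a = C - 2L$ (with $L$ the angular momentum about the fixed axes) gives $1/a = C - 2\varrho^4 - 2\varrho\sqrt{\varrho^6 - C\varrho^2 + 2}$ at a point of $S_2$; the function names are hypothetical.

```python
import math

TWO_PI = 2.0 * math.pi


def semi_major_axis(rho, c):
    """a(rho) for a state on S_2 (sign of rho: + direct, - retrograde tangent)."""
    root = math.sqrt(rho**6 - c * rho * rho + 2.0)
    return 1.0 / (c - 2.0 * rho**4 - 2.0 * rho * root)


def twist_map(rho, theta, c):
    """One application of T from (16): rho is kept, theta is turned back by 2 pi a^(3/2)."""
    a = semi_major_axis(rho, c)
    return rho, (theta - TWO_PI * a**1.5) % TWO_PI


# Consistency check against a direct Kepler ellipse with a = 0.3, e = 0.4,
# entered at its pericenter r = a (1 - e), i.e. rho = +sqrt(a (1 - e)).
a_dir, e_dir = 0.3, 0.4
c_dir = 1.0 / a_dir + 2.0 * math.sqrt(a_dir * (1.0 - e_dir**2))
rho_dir = math.sqrt(a_dir * (1.0 - e_dir))

# Same check for a retrograde ellipse with a = 0.2, e = 0.1 (rho negative).
a_ret, e_ret = 0.2, 0.1
c_ret = 1.0 / a_ret - 2.0 * math.sqrt(a_ret * (1.0 - e_ret**2))
rho_ret = -math.sqrt(a_ret * (1.0 - e_ret))

rho_next, theta_next = twist_map(rho_dir, 1.0, c_dir)
```

Because $a(\varrho)$ varies with $\varrho$, the rotation applied to $\theta$ differs from one invariant circle $\varrho = \text{const}$ to the next, which is the twist property on which the later qualitative arguments rest.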


9. - The periodic motions $L_1$ and $L_2$ for $\mu \neq 0$.

The two circular motions which exist for $\mu = 0$, namely the retrograde motion $L_1$ and the direct motion $L_2$, admit, for $C > \sqrt[3]{32}$ and for $\mu$ sufficiently small, an analytic extension, symmetric with respect to the $x$-axis (or the $p$-axis). One could have foreseen a symmetric extension, since (1), (3) and (1'), (3') are not modified when one replaces $x$, $y$, $t$ or $p$, $q$, $\tau$ by $x$, $-y$, $-t$ or $p$, $-q$, $-\tau$ respectively. The existence of an analytic extension was proved for the first time by Poincaré (loc. cit.) and later with other methods by Levi-Civita (see his memoir I), Moulton (7) and myself (see I, section 12). Hence for such given values of $C$ these periodic motions $L_1$ and $L_2$ will vary analytically with $\mu$ for $\mu \leq \mu_0$, where $\mu_0 > 0$.

Let us make an important remark concerning the neighborhood of $L_1$ and of $L_2$ for $\mu$ sufficiently small. By a conformal change of variables of the form

$x + \sqrt{-1}\,y = f(u + \sqrt{-1}\,v), \qquad dt = |f'(w)|^2\,d\tau,$

where $f$ is an analytic function of $w = u + \sqrt{-1}\,v$, $C$ and $\mu$ (see I, sections 2, 3), one can transform the curve of motion $L_1$, for example, into the $u$-axis of the $u$, $v$ plane, with $u$ equal to the arc and $\tau = t$ along this axis, in such a way that the modified equations are periodic in $u$. In this manner one obtains the equation of normal displacement along $L_1$, which reduces for $\mu = 0$ to

$\dfrac{d^2\delta n}{dt^2} + a_1^{-3}\,\delta n = 0.$

For $\mu$ small this equation will be of the form

$\dfrac{d^2\delta n}{dt^2} + I\,\delta n = 0,$

where $I(t, \mu, C)$ is analytic in $t$, $\mu$ and $C$, and periodic with the period $T$ of the motion; moreover, one has $I(t, 0, C) = a_1^{-3}$. But by a suitable transformation of the form

$\delta n^* = l(t, \mu, C)\,\delta n, \qquad t^* = \varphi(t, \mu, C),$

where $l(t, \mu, C)$ and $\varphi(t, \mu, C)$ are analytic in $t$, $\mu$, $C$ with $l \neq 0$, $d\varphi/dt > 0$, and where the functions $l$ and $d\varphi/dt$ are periodic in $t$ with the period $T$, one can transform the equation written above into a similar equation

$\dfrac{d^2\delta n^*}{dt^{*2}} + \delta n^* = 0$ (8),

whose solutions are all periodic of period $2\pi$, and such that the zeros of the solutions follow one another at intervals of value exactly $\pi$. Let $\tau_1^*(\mu, C)$ be the interval of the $t^*$-axis which corresponds to the periodic trajectory $L_1$; this interval will depend analytically on $\mu$ and $C$. Considering the alternate zeros of $\delta n^*$ and of $\delta n$, it follows that the corresponding rotation coefficient (9) is precisely $k_1 = 4\pi^2/\tau_1^*(\mu, C)$. Consequently the solutions $\delta n$ in the infinitesimal neighborhood of $L_1$ cross it successively, and thus define the analytic rotation coefficient $k_1(\mu, C)$, which reduces to $2\pi(1 - a_1^{3/2})$ for $\mu = 0$.

In this manner one sees that the periodic motion $L_1$ is of formally stable type (elliptic (10)) for $\mu$ small, if $k_1/2\pi$ is not a rational number. Levi-Civita has shown (loc. cit., I) that in the case of a rational ratio the motion can be of formally unstable type (hyperbolic (11)). In the same manner, $L_2$ will be of stable type if $k_2/2\pi$ is not equal to a rational number.

To summarize, symmetric analytic continuations $L_1$ and $L_2$ exist for $\mu$ sufficiently small and define rotation coefficients $k_1(\mu, C)$ and $k_2(\mu, C)$ respectively, determined by the successive order of the necessarily simple crossings of $L_1$ and $L_2$ in the $x$, $y$ plane by a curve of motion in the infinitesimal neighborhood. The functions $k_1(\mu, C)$ and $k_2(\mu, C)$ are analytic in $\mu$ and $C$ and reduce to $2\pi(1 - a_1^{3/2})$ and $2\pi(1 - a_2^{3/2})$ respectively for $\mu = 0$. The motion $L_i$ ($i = 1, 2$) will be formally stable at least if $k_i(\mu, C)/2\pi$ is not equal to a rational number.


(7) See his book: Periodic Orbits, Washington, 1920.

Voyons ce que signifie cela dans l’espace S 3 . Un tel croisement au point ( x , y) 


(*) Voir on cet £gard E. Swift: Canonical Forms for Ordinary Homogeneous Linear 
Differential Equations of the Second Order with Periodic Coefficients , American Journal 
of Mathematics, t. 50 (1928). 

(*) Pour la definition du coefficient de rotation, voir par exemple III, section 45. 

(*°) Voir III, sections 42-46. 

(") Voir III, sections 27-41. 


480 



16 


G. D. Birkhoff: Sur le probldme restreint des trois corps [282] 

of $L_1$, for example, is characterized by the equations $x = x(t)$, $y = y(t)$, where $t$ denotes the time along $L_1$, and by equation (3'). Eliminating $t$, these equations give a two-dimensional surface $S_1''$ which contains the trajectory $L_1$.

Moreover this surface will be regular and analytic along $L_1$. Indeed, if $x' \neq 0$ at the point considered, these conditions can be written in the form

$$y - f(x) = 0; \qquad \tfrac{1}{2}(x'^2 + y'^2) - \Omega + \tfrac{C}{2} = 0,$$

where $y - f(x) = 0$ is the equation of the curve $L_1$. The corresponding matrix is

$$\begin{Vmatrix} -f' & 1 & 0 & 0 \\ -\Omega_x & -\Omega_y & x' & y' \end{Vmatrix},$$

which is of rank one only if $x' = y' = 0$, which is not possible. The same reasoning applies to every point of $L_1$ where $y' \neq 0$. Consequently the surface $S_1''$ must be regular and analytic in the neighborhood of $L_1$.

Let us also remark that the angle $\gamma$ formed by the surface $S_1''$ and a trajectory vanishes only along $L_1$. Indeed, for $x' \neq 0$ this angle has a sine of the order of magnitude of

$$\left(\frac{y'}{x'} - f'(x)\right)\frac{dx}{ds},$$

where $ds$ denotes the element of arc of the trajectory. But we also have

$$\frac{y'}{x'} - f'(x) = \tan\beta - \tan\alpha,$$

where $\alpha$ denotes the angle between the direction of the curve of $L_1$ and the $x$ axis of the $x, y$ plane, and where $\beta$ denotes the analogous angle for the curve of the trajectory. Consequently the angle $\gamma$ is of the first order with respect to $\beta - \alpha$, that is, with respect to the distance of the point considered from $L_1$. If $x' = 0$, it suffices to interchange $x$ and $y$.

Hence the surfaces $S_1''$ and $S_2''$, which correspond to the two sets of states of motion $(x, y, x', y')$ for which $(x, y)$ lies respectively on the two curves of motion of $L_1$ and of $L_2$, will contain the respective trajectories $L_1$ and $L_2$. These surfaces are regular and analytic along these trajectories. Moreover, the angle of crossing of $S_1''$ or of $S_2''$ with such a trajectory is everywhere of the first order with respect to the distance from $L_1$ or from $L_2$ respectively.

Consider now a trajectory near $L_1$, for example, which cuts $S_1''$ at the point $P$ on one side of $L_1$. The same trajectory will later cut $S_1''$ in a first point $Q$ on the other side of $L_1$, as the results above show. Then it will cut $S_1''$ in a point $P_1$ on the same side. Let us write $P_1 = \bar T_1(P)$. The transformation $\bar T_1$ of one side of $S_1''$ into itself must be analytic, except



perhaps along $L_1$, where it remains continuous and defines the rotation coefficient $k_1(\mu, C)$.

Hence the successive crossings of a neighboring trajectory with the part of $S_1''$ on a single side of $L_1$ define a transformation $\bar T_1$ of $S_1''$ into itself, which is one-to-one and continuous, even on the boundary $L_1$, with the rotation coefficient $k_1(\mu, C)$. Except perhaps on this boundary, $\bar T_1$ will be analytic. There is a completely analogous transformation $\bar T_2$ with respect to $S_2''$ and $L_2$.

Evidently the modified coefficient $k_1(\mu, C)/2\pi$, for example, represents the mean number of times that the neighboring trajectory circulates around $L_1$ in $S_3$ while it makes one complete circuit along $L_1$.

Now let $S_1'$ be another surface, regular and analytic along $L_1$, such that every neighboring trajectory cuts it at an angle of the first order with respect to the distance from $L_1$. Suppose moreover that $S_1'$ has no point in common with $S_1''$ in the neighborhood of $L_1$. It is then geometrically evident that the transformation of $S_1'$ analogous to $\bar T_1$ will have the same coefficient as $\bar T_1$ along $L_1$, namely $k_1(\mu, C)$. An analogous property will hold along $L_2$.

In the following section we shall extend the definition of $S_2$ to the case $\mu \neq 0$ in such a way that $S_2$ satisfies the conditions imposed on $S_1'$ and $S_2'$ along $L_1$ and $L_2$ simultaneously.

10. - The surface of section $S_2$ for $\mu \neq 0$.

The same circles $p^2 + q^2 = \text{const.}$, whose tangent states for $\mu = 0$ constitute $S_2$, may equally be regarded as the circles of motion for $C^* \ge C$; here $C^*$ may become infinite, and at the same time the radius of the corresponding circle reduces to zero (I, section 10). This suggests defining $S_2$ for $\mu \neq 0$ as the set of states tangent to the analytic extensions of the direct and retrograde circular motions for $\mu \neq 0$. I gave this definition previously (I, sections 10-14). But for our present purposes it is necessary to go further still.

To do so, we shall consider in the first place the interval $C \le C^* \le \bar C$, where $\bar C$ is arbitrarily large but finite. The abscissa $x$ of the point of crossing of such a direct or retrograde analytic extension with the positive $x$ axis is an analytic function of $\mu$ and of $C^*$ (see I, section 11). Consequently the abscissa $\rho$ of this crossing in the coordinates $p, q$ is also an analytic function of $\mu$ and $C^*$, and one may write, for the symmetric periodic curves of direct or retrograde motion,

$$p = f(\theta, \mu, C^*), \qquad q = g(\theta, \mu, C^*),$$

where $\theta/2$ denotes the mean anomaly, in the time $\tau$, measured from the point of crossing of the $p$ axis. Here $f$ and $g$ are analytic in $\theta, \mu, C^*$ and periodic of period $4\pi$



in $\theta$ for $C \le C^* \le \bar C$ and for $\mu$ sufficiently small. Moreover, we have

$$\rho = \varphi(\mu, C^*) = \varphi_0(C^*) + \mu\,\varphi_1(C^*) + \cdots,$$

where $\varphi$ is analytic in $C^*$ and $\mu$.

But for the integrable case $\mu = 0$ this reduces to $\rho = \varphi_0(C^*)$, and an immediate calculation (see I, equations (36)) shows that the relation between $\rho$ and $C^*$ is defined by the equation

c--2 e +i. 

Consequently one obtains

since $|\rho| < 1$ everywhere. It follows that the function $\varphi_0(C^*)$ for the direct or retrograde case is analytic with $\varphi_0' \neq 0$, and that we may replace the variable $C^*$ by the variable $\rho$ in the equations above, which gives

$$p = \bar f(\rho, \theta, \mu), \qquad q = \bar g(\rho, \theta, \mu),$$

where $\bar f$ and $\bar g$ are analytic with respect to the variables indicated and periodic in $\theta$, of period $4\pi$.

But for $\mu = 0$ the parameters $\rho$ and $\theta$ thus defined coincide with the normal parameters $\rho$ and $\theta$ defined above. Consequently the surface $S_2$ will vary analytically with $\mu$, and $\rho$ and $\theta$ will still be normal parameters for $\mu$ sufficiently small, at least outside a small neighborhood of the closed curve $p = q = 0$ of $S_2$ ($C^*$ large).

Before passing to the case $C^* \ge \bar C$, let us remark that, at least for the two parts $C \le C^* \le \bar C$ of $S_2$, that is, when $\rho$ is not very small, the surface $S_2$ enjoys certain other important properties which are valid for $\mu = 0$. It is traversed by all trajectories in the same sense, and the angle with such a trajectory becomes small only in the neighborhood of the boundaries $L_1$ and $L_2$, where the angle remains precisely of the first order. Moreover, the rotation coefficients along $L_1$ and $L_2$ must be precisely $k_1(\mu, C)$ and $k_2(\mu, C)$, since $S_2$ will have no points in common with the surfaces $S_1''$ and $S_2''$ defined in the preceding section. Indeed the curves of the auxiliary periodic motions will remain completely inside $L_1$ and $L_2$, at least for $\mu$ sufficiently small. All this follows immediately from the analyticity of the situation.

It remains to consider the case $C^* \ge \bar C$. To study this case it is of great use (see I, section 12) to employ the following transformation of variables

$$x = \lambda^2 \bar x, \qquad y = \lambda^2 \bar y, \qquad t = \lambda^3 \bar t, \tag{17}$$

replacing $C^*$ by $1/\lambda^2$ throughout. Let us remark here that for $C^* = \infty$ we shall have $\lambda = 0$, and that a periodic curve, symmetric with respect to the $x$ axis and retrograde,



for a given value of $\lambda$, will correspond to a similar, but direct, curve for the value $-\lambda$ of $\lambda$.

Let us begin by showing that the following expansions hold for the symmetric periodic motions:

j *” r=ah cos 

Here $a_3, b_3, \ldots$ are analytic in $\mu, \bar t$, and the right-hand members are periodic in $\bar t$ for small $\mu$. These expansions are analogous to those which I developed earlier (I, section 12), but are a little more explicit.

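Let us first remark that the differential equations and the Jacobi integral have the respective forms

$$\frac{d^2\bar x}{d\bar t^2} - 2\lambda^3\frac{d\bar y}{d\bar t} = \frac{\partial\bar\Omega}{\partial\bar x}, \qquad \frac{d^2\bar y}{d\bar t^2} + 2\lambda^3\frac{d\bar x}{d\bar t} = \frac{\partial\bar\Omega}{\partial\bar y},$$

$$\left(\frac{d\bar x}{d\bar t}\right)^2 + \left(\frac{d\bar y}{d\bar t}\right)^2 = 2\bar\Omega - (1 - 3\lambda^2\mu),$$

where

$$\bar\Omega = \tfrac{1}{2}\lambda^6(\bar x^2 + \bar y^2) + \frac{1-\mu}{\sqrt{\bar x^2 + \bar y^2}},$$

if one neglects terms in $\lambda^6\mu, \lambda^8\mu, \ldots$ in the right-hand members of these equations. The equations thus obtained are rigorously those which one obtains by starting from a point of mass $1 - \mu$ at the origin $\bar x = \bar y = 0$ and employing coordinates $\bar x, \bar y$ referred to axes which rotate in the positive sense with angular velocity $\lambda^3$.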

To make use of these facts let us recall some results of my Memoir. Consider the particular motion for which the point of zero mass lies, for $\bar t = 0$, on the $\bar x$ axis at $(\bar x, 0)$, with the velocity, in the direction of the $\bar y$ axis, imposed by the Jacobi integral. If $\bar x$ does not differ much from 1, the curve of motion will cut the negative $\bar x$ axis near $\bar x = -1$; indeed for $\lambda = \mu = 0$ there is a direct circular motion of radius 1 about $\bar x = \bar y = 0$. Let $\Psi(\bar x, \lambda, \mu)$ denote the component of velocity in the positive direction of the $\bar x$ axis at this second crossing, which vanishes for a symmetric periodic motion. This function will be analytic in $\bar x$, $\lambda$ and $\mu$, and will be divisible by $\lambda^3$, as I have shown. Indeed the analogous function $\bar\Psi(\bar x, \lambda, \mu)$ attached to the equations modified as above can differ from $\Psi(\bar x, \lambda, \mu)$ only by powers $\lambda^6, \lambda^8, \ldots$ of $\lambda$, and also admits the factor $\lambda^3$. Moreover, in the series which remains when this factor is removed there is a term $\alpha_3(\bar x - 1)$ with $\alpha_3 \neq 0$, which is also found in $\Psi(\bar x, \lambda, \mu)$. It follows that for the modified symmetric periodic motion one will have for $\bar x - 1$ the same series in powers of $\lambda$ and $\mu$ as for the



symmetric periodic motion, up to the terms in $\lambda^3$ at least. Consequently the series for $\bar x, \bar y$ will be the same up to that point.

One can obtain the exact forms of these modified series. Indeed the modified differential equations admit a circular symmetric periodic motion, which is the motion in question. Referred to the fixed axes it is obtained in the form

$$x = a\cos\!\left(a^{-3/2}\sqrt{1-\mu}\;\bar t\right), \qquad y = a\sin\!\left(a^{-3/2}\sqrt{1-\mu}\;\bar t\right).$$

But the angular velocity of the relative axes is $\lambda^3$. Consequently the equations for $\bar x$ and $\bar y$ derived from the modified equations are

$$\bar x = a\cos\!\left[\left(a^{-3/2}\sqrt{1-\mu} - \lambda^3\right)\bar t\right], \qquad \bar y = a\sin\!\left[\left(a^{-3/2}\sqrt{1-\mu} - \lambda^3\right)\bar t\right],$$

and the Jacobi integral serves to determine $a$; up to the terms in $\lambda^3$ one has

$$a = \frac{1-\mu}{1 - 3\lambda^2\mu}.$$

Thus the expansions of $\bar x, \bar y$ for the symmetric periodic motion must have the stated forms up to the terms in $\lambda^3$.
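The circular motion just described can be checked numerically. The following is only a sketch under stated assumptions: the modified equations are taken in the rotating-axes form $\bar x'' - 2\lambda^3\bar y' = \partial\bar\Omega/\partial\bar x$, $\bar y'' + 2\lambda^3\bar x' = \partial\bar\Omega/\partial\bar y$ with $\bar\Omega = \tfrac12\lambda^6(\bar x^2+\bar y^2) + (1-\mu)/\bar r$, and the function names and the sample values of $\mu$ and $\lambda$ are hypothetical.

```python
import math

def residuals(mu, lam, a, t):
    """Residuals of the (assumed) modified equations on the circular motion of radius a."""
    omega = lam**3                        # angular velocity of the rotating axes
    n = math.sqrt(1.0 - mu) * a**-1.5     # Kepler mean motion about the mass 1 - mu
    w = n - omega                         # angular velocity relative to the rotating axes
    x, y = a * math.cos(w * t), a * math.sin(w * t)
    vx, vy = -a * w * math.sin(w * t), a * w * math.cos(w * t)
    ax, ay = -w * w * x, -w * w * y
    r3 = (x * x + y * y) ** 1.5
    # gradient of Omega = (1/2) lam^6 (x^2 + y^2) + (1 - mu) / r
    ox = omega**2 * x - (1.0 - mu) * x / r3
    oy = omega**2 * y - (1.0 - mu) * y / r3
    return ax - 2.0 * omega * vy - ox, ay + 2.0 * omega * vx - oy

def jacobi_constant(mu, lam, a):
    """Value of 2*Omega - (x'^2 + y'^2) on the circular motion of radius a."""
    omega = lam**3
    n = math.sqrt(1.0 - mu) * a**-1.5
    w = n - omega
    return omega**2 * a * a + 2.0 * (1.0 - mu) / a - (a * w) ** 2

mu, lam = 0.01, 0.2                       # hypothetical sample values
a = (1.0 - mu) / (1.0 - 3.0 * lam**2 * mu)
worst = max(max(abs(r) for r in residuals(mu, lam, a, 0.37 * k)) for k in range(20))
gap = jacobi_constant(mu, lam, a) - (1.0 - 3.0 * lam**2 * mu)
print(worst, gap)
```

With this choice of $a$ the equations of motion are satisfied to rounding error, while the Jacobi constant differs from $1 - 3\lambda^2\mu$ by a quantity of order $\lambda^3$, in agreement with the statement that $a$ is determined only up to the terms in $\lambda^3$.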

Let us now introduce the mean anomaly $\bar t^*$ defined by the equation $\bar t^* = 2\pi\bar t/T$, where $T(\lambda, \mu)$ represents the period of the motion.

For the half-period one has $\bar y = 0$, and $\bar t$ is near $\pi(1 - \mu)$. Employing the equation which gives $\bar y$ we conclude that

$$T = 2\pi(1-\mu)\left[1 + \tfrac{9}{2}\lambda^2\mu + \lambda^3 c_3(\mu) + \cdots\right].$$

Substituting, we find the equation

$$\bar t = (1-\mu)\left[1 + \tfrac{9}{2}\lambda^2\mu + \lambda^3 c_3(\mu) + \cdots\right]\bar t^*,$$

and employing this equation we find furthermore

$$\begin{cases}\bar x = (1-\mu)\cos\bar t^* + 3\lambda^2\mu(1-\mu)\cos\bar t^* + \lambda^3 a_3(\mu, \bar t^*) + \cdots, \\ \bar y = (1-\mu)\sin\bar t^* + 3\lambda^2\mu(1-\mu)\sin\bar t^* + \lambda^3 b_3(\mu, \bar t^*) + \cdots,\end{cases}$$

where $a_3, b_3, \ldots$ are analytic in $\mu$ and in the mean anomaly $\bar t^*$, and periodic of period $2\pi$ in $\bar t^*$.

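Let us now seek to pass from the variables $\bar x, \bar y, \bar t^*$ to the variables $\bar p, \bar q, \bar\tau^*$, where $p = \lambda\bar p$, $q = \lambda\bar q$ and where $\bar\tau^*$ is the mean anomaly in $\bar\tau$, and also in $\tau$, with $\tau = \lambda\bar\tau$. Evidently we shall have

$$\bar x = \bar p^2 - \bar q^2, \qquad \bar y = 2\bar p\bar q, \qquad d\bar t = 4(\bar p^2 + \bar q^2)\,d\bar\tau.$$

We may combine the equations above into a single complex equation:

$$\bar x + \sqrt{-1}\,\bar y = e^{\sqrt{-1}\,\bar t^*}\left[(1-\mu)(1 + 3\lambda^2\mu) + \lambda^3 d_3(\mu, \bar t^*) + \cdots\right],$$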




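where $d_3, d_4, \ldots$ are complex functions analogous to $a_3, \ldots, b_3, \ldots$. Hence one obtains immediately

$$\bar p + \sqrt{-1}\,\bar q = \sqrt{\bar x + \sqrt{-1}\,\bar y} = e^{\sqrt{-1}\,\bar t^*/2}\left[\sqrt{1-\mu}\,\left(1 + \tfrac{3}{2}\lambda^2\mu\right) + \lambda^3 e_3(\mu, \bar t^*) + \cdots\right],$$

$$\begin{cases}\bar p = \sqrt{1-\mu}\,\cos\dfrac{\bar t^*}{2} + \tfrac{3}{2}\lambda^2\mu\sqrt{1-\mu}\,\cos\dfrac{\bar t^*}{2} + \lambda^3 k_3(\mu, \bar t^*) + \cdots, \\[2mm] \bar q = \sqrt{1-\mu}\,\sin\dfrac{\bar t^*}{2} + \tfrac{3}{2}\lambda^2\mu\sqrt{1-\mu}\,\sin\dfrac{\bar t^*}{2} + \lambda^3 l_3(\mu, \bar t^*) + \cdots.\end{cases}$$

The coefficients $k_3, l_3, \ldots$ are here analytic and of period $4\pi$ in $\bar t^*$. When $\bar t^*$ is increased by $2\pi$, the variables $\bar p$ and $\bar q$ are transformed into $-\bar p$, $-\bar q$ respectively. Moreover, of these functions one is even and the other odd in $\bar t^*$.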
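The passage from $(\bar x, \bar y)$ to $(\bar p, \bar q)$ is a complex square root, which is why the period doubles to $4\pi$ and why an increase of $2\pi$ in the anomaly changes the sign of both variables. A small illustrative sketch follows; it keeps only the leading radius $(1-\mu)(1+3\lambda^2\mu)$, and the function names and sample values are hypothetical.

```python
import cmath
import math

def pq_from_anomaly(t_star, mu, lam):
    """Leading terms of p + i q = sqrt(x + i y) on the circular motion."""
    radius = (1.0 - mu) * (1.0 + 3.0 * lam**2 * mu)
    z = math.sqrt(radius) * cmath.exp(0.5j * t_star)
    return z.real, z.imag

def xy_from_pq(p, q):
    """The map x = p^2 - q^2, y = 2 p q, i.e. x + i y = (p + i q)^2."""
    return p * p - q * q, 2.0 * p * q

mu, lam, t_star = 0.01, 0.1, 0.8          # hypothetical sample values
p, q = pq_from_anomaly(t_star, mu, lam)
p2, q2 = pq_from_anomaly(t_star + 2.0 * math.pi, mu, lam)
x, y = xy_from_pq(p, q)
x2, y2 = xy_from_pq(p2, q2)
print((p, q), (p2, q2), (x, y), (x2, y2))
```

The two states $(\bar p, \bar q)$ and $(-\bar p, -\bar q)$ give the same point $(\bar x, \bar y)$, so the pair of signs carries no dynamical information.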

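Let us now seek to introduce into these equations the mean anomaly $\bar\tau^*$ relative to the modified time $\bar\tau$ in place of $\bar t^*$. For this it is necessary to study more closely the relations between $\bar t$, $\bar t^*$, $\bar\tau$ and $\bar\tau^*$.

The period $T$ in the time $\bar t$ depends analytically on $\lambda$ and $\mu$, and reduces to $2\pi(1-\mu)$ up to the second order in $\lambda$. Hence one has

$$\bar t = \left[(1-\mu) + \lambda^2 m_2(\lambda, \mu)\right]\bar t^*.$$

Moreover, one has $\bar\tau^* = 2\pi\bar\tau/T'$, where $T'$ denotes the period in $\bar\tau$. But from the equation $d\bar t = 4(\bar p^2 + \bar q^2)\,d\bar\tau$ we deduce $T\,d\bar t^*/2\pi = 4(\bar p^2 + \bar q^2)\,d\bar\tau$. Hence, employing the known form of $T$, $\bar p$ and $\bar q$, we obtain

$$d\bar\tau = \tfrac{1}{4}\left(1 + \lambda^2 n_2(\lambda, \mu, \bar t^*)\right)d\bar t^*,$$

where $n_2$ is analytic in $\lambda, \mu, \bar t^*$ and periodic of period $2\pi$ in $\bar t^*$. Integrating from $0$ to $2\pi$ in $\bar t^*$, there results for the period in $\bar\tau$:

$$T' = \frac{\pi}{2}\left(1 + \lambda^2 p_2(\lambda, \mu)\right).$$

Consequently one will have

$$d\bar\tau^* = \left(1 + \lambda^2 q_2(\lambda, \mu, \bar t^*)\right)d\bar t^*.$$

Integrating from $\bar t^* = 0$ we obtain an equation

$$\bar\tau^* = \bar t^* + \lambda^2 r_2(\lambda, \mu, \bar t^*),$$

where the analytic function $r_2$ must be periodic of period $2\pi$ in $\bar t^*$ and vanish identically for $\bar t^* = 0$.

Hence if one expresses $\bar t^*$ as a function of $\bar\tau^*$ in the equations for $\bar p$ and $\bar q$, there result the following analogous equations:

$$\bar p = \sqrt{1-\mu}\,\cos\frac{\bar\tau^*}{2} + \lambda^2 f_2(\lambda, \mu, \bar\tau^*), \qquad \bar q = \sqrt{1-\mu}\,\sin\frac{\bar\tau^*}{2} + \lambda^2 g_2(\lambda, \mu, \bar\tau^*).$$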




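On the other hand, for $\bar\tau^* = 0$ we shall have $\bar q = 0$, and $\bar p$ reduces to $\bar\rho$, where $\bar\rho$ is the abscissa of crossing of the $\bar p$ axis in the $\bar p, \bar q$ plane. Hence one will have

$$\rho = \lambda\bar\rho = \lambda\left[\sqrt{1-\mu} + \lambda^2 f_2(\lambda, \mu, 0)\right].$$

The abscissa $\rho$ plays the role of one of our normal variables of $S_2$, while the other normal variable is the mean anomaly $\tau^*$, which we shall also denote by $\theta$.

The equation which we have just written shows that we may replace $\lambda$ by $\rho$, even for small $\lambda$. Introducing $\rho$ in place of $\lambda$ we obtain

$$p = \rho\cos\frac{\theta}{2} + \rho^3 f_1^*(\rho, \theta, \mu), \qquad q = \rho\sin\frac{\theta}{2} + \rho^3 g_1^*(\rho, \theta, \mu), \tag{18}$$

where $f_1^*, g_1^*$ are analytic in $\rho, \theta, \mu$ and periodic, of period $4\pi$, in $\theta$.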

Let us now seek analogous equations for the components $p', q'$ of velocity for the chosen value of $C$, that is, for $\lambda = \lambda_0$ and $\rho = \rho_0$. Let $\tau_0$ and $t_0$ be the particular times corresponding to this value of $C$. We shall have

->_ _ dp dt• dt ^ dp dt* dt_ 

P "" dt 0 dO dt d r u dO dt dt 0 ’ 

since $\theta = \tau^*$. The first factor on the right can here be determined directly by means of the first equation (18). To obtain the second factor we have only to remark that, according to the relations above,


i Vi — * 
e 


(i+p*r t (p, e, p)). 


As for the third factor, we have


dt _ l/2 Q(r, y ./i) — C 
dt 0 \ )-C* 


where $C = 1/\lambda_0^2$, $C^* = 1/\lambda^2$. Let us now write

<p(e) ~ iK ‘~- -1+e V»(e) + 

*<*- 0, M) - gqfr.’.rt,, -1+ e'b.(0, a + „ 

Substituting, we obtain the expressions for $p'$ and $q'$:

/1Q . i P ~ “ 2 fi—P sin \ + 9o, 0, p), 

\ q'— 2 h—p cos - +g*g,*(g, 0, p)ya(g, g 0 , 0, p), 

*(Q, Qo,0, p)-\/1 + b(g, 9, ,u)(l - 

r v eo<p-(e 0 v 

where $f_2^*$, $g_2^*$ are analytic in the variables indicated, of period $4\pi$ in $\theta$.




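But for $\rho = 0$ the characteristic matrix

$$\begin{Vmatrix} \dfrac{\partial p}{\partial\rho} & \dfrac{\partial q}{\partial\rho} & \dfrac{\partial p'}{\partial\rho} & \dfrac{\partial q'}{\partial\rho} \\[3mm] \dfrac{\partial p}{\partial\theta} & \dfrac{\partial q}{\partial\theta} & \dfrac{\partial p'}{\partial\theta} & \dfrac{\partial q'}{\partial\theta} \end{Vmatrix}$$

is of the form

$$\begin{Vmatrix} \cos\dfrac{\theta}{2} & \sin\dfrac{\theta}{2} & A & B \\[3mm] 0 & 0 & -\cos\dfrac{\theta}{2} & -\sin\dfrac{\theta}{2} \end{Vmatrix},$$

neglecting a factor $\sqrt{1-\mu}$ in the second row. Hence the matrix is always of rank two. Consequently the parameters $\rho, \theta$ are normal even in the neighborhood of $\rho = 0$.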

Recalling that $(p, q, p', q')$ represents the same state of motion as $(-p, -q, -p', -q')$, and recalling the symmetry of the equations, we arrive at the following results:

The surface of section $S_2$, constituted by the set of states tangent to $L_1^*$ and $L_2^*$ for the constant $C^* \ge C$, is analytic everywhere in $S_3$ and has as normal parameters the abscissa $\rho$ of intersection with the positive $p$ axis and the mean anomaly $\theta$ in $\tau$ measured from that axis. This abscissa is taken positive for $L_2^*$ (direct symmetric periodic motion) and negative for $L_1^*$ (retrograde motion).

The coordinates $p, q, p', q'$ of this surface $S_2$ are expressed by the equations (18), (19), where $f_1^*, g_1^*, f_2^*, g_2^*$ and $\alpha$ are analytic functions of $\rho, \theta, \mu$ and of $\rho, \rho_0, \theta, \mu$ respectively, and periodic of period $4\pi$ in $\theta$, $\rho_0$ being the value of $\rho$ for the direct symmetric periodic motion $L_2$.

Moreover the functions $f_1^*, g_1^*, f_2^*, g_2^*$ are changed into $-f_1^*, -g_1^*, -f_2^*, -g_2^*$ respectively when $\theta$ is increased by a half-period $2\pi$, while $f_1^*, g_2^*, \alpha$ are even and $g_1^*, f_2^*$ are odd in $\theta$.

The domain of $S_2$ in $(\rho, \theta)$ is given by the inequalities

$$\rho' = \rho(-\lambda_0, \mu) \le \rho \le \rho_0 = \rho(\lambda_0, \mu) = \rho''; \qquad 0 \le \theta \le 2\pi,$$

where $\rho(\lambda_0, \mu)$ is analytic in $\lambda_0, \mu$ with $\lambda_0 = 1/\sqrt{C}$. The line $\rho = 0$ of $S_2$ corresponds to the states of collision.

Every trajectory other than $L_1$ and $L_2$ will cut $S_2$ in the same sense, and the angle of crossing will remain, near $L_1$ and $L_2$, precisely of the first order with respect to the distance from $L_1$ and $L_2$.


11. - The corresponding transformation $T$.

Now let $P = (\rho, \theta)$ be any point of $S_2$ and let $P_1 = (\rho_1, \theta_1)$ be the unique corresponding point obtained by following the same trajectory in the positive sense from $(\rho, \theta)$ to the first point of meeting with $S_2$. Evidently,




for $\mu$ sufficiently small we thus define a one-to-one analytic transformation $T$ of $S_2$ into itself:

$$T:\quad \rho_1 = f(\rho, \theta), \qquad \theta_1 = \theta + g(\rho, \theta), \tag{20}$$

where $f$ and $g$ are analytic in $\rho, \theta$ and periodic of period $2\pi$ in $\theta$.

We may represent this transformation geometrically in the $x, y$ plane by constructing the auxiliary curves of motion $L_1^*$ and $L_2^*$. We thus obtain two sheets of surface joined into a single one at the origin (see in particular sections 13, 14 of I), each point of which corresponds to a single state of motion of $S_2$ (12). At the moment when a trajectory traverses $S_2$, the corresponding curve of motion becomes tangent to a curve of motion $L^*$ in the positive sense, and conversely. Hence, starting from a point $P$ of $S_2$, one obtains $P_1 = T(P)$ in the manner indicated.

On the other hand the symmetry in $x, y, t$ shows that if $P_1 = T(P)$, we necessarily have $R(P) = T(R(P_1))$, where $R(P_1)$ and $R(P)$ denote the symmetric images of $P_1$ and of $P$ respectively with respect to the $x$ axis. Consequently if $(\rho_1, \theta_1) = T(\rho, \theta)$ one will necessarily have $(\rho, -\theta) = T(\rho_1, -\theta_1)$, whence the symbolic equation

$$RTRT = I \qquad (I = \text{the identity}).$$

This shows that the transformation $RT = U$ is involutive and that $T = RU$. More explicitly, we obtain the corresponding analytic identities

$$\begin{cases}\rho = f\bigl(f(\rho, \theta),\; -[\theta + g(\rho, \theta)]\bigr), \\ g(\rho, \theta) = g\bigl(f(\rho, \theta),\; -[\theta + g(\rho, \theta)]\bigr).\end{cases} \tag{21}$$

The fundamental transformation $T$ of a point $P$ of $S_2$ into a point $P_1 = T(P)$ of $S_2$, obtained by following the corresponding trajectory from the point $P = (\rho, \theta)$ to the next point $P_1 = (\rho_1, \theta_1)$, transforms the interior of $S_2$ into itself in a one-to-one and analytic manner, with $\rho_1, \theta_1$ of the form (20), where $f, g$ are analytic except perhaps on the boundaries $\rho = \rho'$, $\rho = \rho''$. Moreover the equations (21) are satisfied, that is to say, $T = RU$ where $R$ is the reflection $R(\rho, \theta) = (\rho, -\theta)$ and $U$ is another involutive transformation. Along the two boundaries $f$ and $g$ remain continuous, and the transformation of the two boundaries $L_1$ and $L_2$ into themselves is analytic, with analytic rotation coefficients $k_1$ and $k_2$ respectively (see section 9 above).
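The relation $RTRT = I$ can be illustrated on a toy map. The sketch below is not the transformation $T$ of the restricted problem: it is a hypothetical twist map built from a symmetric composition of a "kick" and a "twist", chosen only because it is reversible with respect to $R(\rho, \theta) = (\rho, -\theta)$. All names and the constant `EPS` are assumptions made for the illustration.

```python
import math

EPS = 0.15  # strength of the toy perturbation (hypothetical value)

def kick(rho, theta):
    # half kick; satisfies R K R K = identity
    return rho + 0.5 * EPS * math.sin(theta), theta

def twist(rho, theta):
    # integrable twist; satisfies R A R A = identity
    return rho, theta - 2.0 * math.pi * rho

def T(rho, theta):
    # symmetric composition K A K, hence reversible with respect to R
    return kick(*twist(*kick(rho, theta)))

def R(rho, theta):
    return rho, -theta

def U(rho, theta):
    # U = R T, expected to be an involution
    return R(*T(rho, theta))

def RTRT(rho, theta):
    return R(*T(*R(*T(rho, theta))))

start = (0.4, 1.1)
back = RTRT(*start)
twice = U(*U(*start))
print(back, twice)
```

Since the toy map is written without reducing $\theta$ modulo $2\pi$, both compositions return the starting point up to rounding error, and the two identities corresponding to (21) can be checked directly on $f = \rho_1$ and $g = \theta_1 - \theta$.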


12. - The analyticity of $T$ along $L_1$ and $L_2$.

We now wish to prove the important fact that the transformation $T$ remains analytic even along $L_1$ and $L_2$.


(12) At the origin we have made a special convention.




Consider for example the neighborhood of $L_1$. Evidently one can make an analytic change of the variables $x, y, \varphi$ into $u, v, w$, valid along $L_1$, such that $L_1$ is represented by the axis $u = v = 0$, corresponding to a periodic coordinate $w$ of period $2\pi$, and such that the surface $S_2$ is represented by the part of the plane $u = 0$ which has the $w$ axis as boundary. In such variables the reduced system of differential equations will take the form (13)

$$\frac{du}{dw} = f(u, v, w), \qquad \frac{dv}{dw} = g(u, v, w), \tag{22}$$

where $f$ and $g$ are periodic of period $2\pi$ in $w$ and where $f(0, 0, w) = g(0, 0, w) = 0$.

The equations of variation along $u = v = 0$ are written

$$\begin{cases}\dfrac{d\,\delta u}{dw} = f_u(0, 0, w)\,\delta u + f_v(0, 0, w)\,\delta v, \\[3mm] \dfrac{d\,\delta v}{dw} = g_u(0, 0, w)\,\delta u + g_v(0, 0, w)\,\delta v.\end{cases} \tag{23}$$

The fundamental property of the solutions of the equations of variation is the following: if the pair $u(w, p)$, $v(w, p)$ forms an analytic family of solutions $u, v$ of (22) which depends on the parameter $p$ and contains the solution $(0, 0)$ for $p = 0$, the pair of functions of $w$, $u_p(w, 0)$, $v_p(w, 0)$, will constitute a solution of the equations of variation. Indeed the pair $u_p(w, 0)\,\delta p$, $v_p(w, 0)\,\delta p$ represents (approximately) a solution in the infinitesimal neighborhood of the given solution.

Let us remark in the first place that the periodic function $f_v(0, 0, w)$ is here of the same sign throughout. Indeed the angle formed by the trajectory which passes through $(0, v_0, w_0)$ of the plane $u = 0$ (which corresponds to $S_2$, with $v_0 \ge 0$) and this plane itself has the numerical value $\sin^{-1}\bigl(f/\sqrt{1 + f^2 + g^2}\bigr)$ at this point. To say that the angle is of the first order with respect to the distance from the periodic trajectory $u = v = 0$ amounts to saying that $f_v(0, 0, w)$ does not vanish. This inequality allows us to eliminate $\delta v$ from the equations of variation, and thus to obtain the following ordinary linear equation of the second order in $\delta u$:

$$\left(\frac{\delta u' - f_u\,\delta u}{f_v}\right)' - g_v\,\frac{\delta u' - f_u\,\delta u}{f_v} - g_u\,\delta u = 0, \tag{24}$$

where the coefficient of $\delta u''$ is $1/f_v$, which remains finite and does not vanish. Consequently the element $\delta u$ and its derivative $\delta u'$ cannot vanish simultaneously without vanishing identically.

Now let $u(w; v_0, w_0)$, $v(w; v_0, w_0)$ be the solution of the equations in $u$ and $v$ which reduces to $0, v_0$ for $w = w_0$:

$$u(w_0; v_0, w_0) = 0, \qquad v(w_0; v_0, w_0) = v_0. \tag{25}$$


(13) The letters $f$ and $g$ employed here must not be confused with the same letters used above, of which they are entirely independent.




Evidently this solution will depend analytically on $w, v_0, w_0$ for arbitrary $w$ and $w_0$ and small $v_0$. Moreover, for $v_0 = 0$ the pair of functions $u, v$ reduces to $0, 0$ whatever $w_0$ may be, since the corresponding trajectory then becomes $u = v = 0$. Hence we may develop $u, v$ in series of powers of $v_0$ as follows:

$$\begin{cases}u(w; v_0, w_0) = v_0\bigl(\delta u(w; w_0) + v_0\,u_2(w; w_0) + \cdots\bigr), \\ v(w; v_0, w_0) = v_0\bigl(\delta v(w; w_0) + v_0\,v_2(w; w_0) + \cdots\bigr).\end{cases} \tag{26}$$

Here $\delta u, \delta v$ denotes the particular solution of the equations of variation for which one has

$$\delta u(w_0; w_0) = 0, \qquad \delta v(w_0; w_0) = 1.$$

To prove this one has only to employ the fundamental property of these solutions already mentioned and the equations (26) above.

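Let $\delta u^*, \delta v^*$ and $\delta u^{**}, \delta v^{**}$ be the two particular solutions defined by

$$\delta u^*(0) = 1, \quad \delta v^*(0) = 0; \qquad \delta u^{**}(0) = 0, \quad \delta v^{**}(0) = 1.$$

These solutions are linearly independent. Hence any solution, in particular $\delta u, \delta v$, can be expressed by means of these two independent solutions. One thus finds

$$\begin{cases}\delta u(w; w_0) = \bigl[\delta u^{**}(w_0)\,\delta u^*(w) - \delta u^*(w_0)\,\delta u^{**}(w)\bigr]/\Delta(w_0), \\ \delta v(w; w_0) = \bigl[\delta u^{**}(w_0)\,\delta v^*(w) - \delta u^*(w_0)\,\delta v^{**}(w)\bigr]/\Delta(w_0),\end{cases} \tag{27}$$

where

$$\Delta(w) = \delta u^{**}(w)\,\delta v^*(w) - \delta u^*(w)\,\delta v^{**}(w) = -e^{\int_0^w (f_u + g_v)\,dw}. \tag{28}$$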
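The exponential formula for the determinant $\Delta(w)$ is the classical Liouville (Abel) identity for a linear system. It can be checked numerically on a toy system; in the sketch below the four periodic coefficients are hypothetical stand-ins for $f_u, f_v, g_u, g_v$ along $u = v = 0$, and the integrator is an ordinary fourth-order Runge-Kutta scheme.

```python
import math

def coeffs(w):
    # hypothetical periodic stand-ins for f_u, f_v, g_u, g_v along u = v = 0
    return 0.2 * math.cos(w), 1.0 + 0.3 * math.sin(w), -1.0, 0.1 * math.sin(w)

def rhs(w, s):
    fu, fv, gu, gv = coeffs(w)
    du, dv = s
    return (fu * du + fv * dv, gu * du + gv * dv)

def rk4(s, w_end, steps=4000):
    """Integrate the variational system from w = 0 to w = w_end."""
    h, w = w_end / steps, 0.0
    for _ in range(steps):
        k1 = rhs(w, s)
        k2 = rhs(w + h / 2, (s[0] + h / 2 * k1[0], s[1] + h / 2 * k1[1]))
        k3 = rhs(w + h / 2, (s[0] + h / 2 * k2[0], s[1] + h / 2 * k2[1]))
        k4 = rhs(w + h, (s[0] + h * k3[0], s[1] + h * k3[1]))
        s = (s[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             s[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        w += h
    return s

W = 5.0
u1, v1 = rk4((1.0, 0.0), W)   # solution with initial values (1, 0)
u2, v2 = rk4((0.0, 1.0), W)   # solution with initial values (0, 1)
delta = u2 * v1 - u1 * v2
# integral of f_u + g_v from 0 to W for the coefficients chosen above
expected = -math.exp(0.2 * math.sin(W) + 0.1 * (1.0 - math.cos(W)))
print(delta, expected)
```

In particular $\Delta$ never vanishes, which is what allows division by $\Delta(w_0)$ when a general solution is expressed through the two particular ones.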


With these preliminaries we can discuss the equation in $w$, $u(w; v_0, w_0) = 0$, which gives the values of $w$ for which a trajectory near $u = v = 0$ traverses the plane $u = 0$. On removing the factor $v_0$, this equation takes the form

$$\Phi(w; w_0, v_0) \equiv \delta u(w; w_0) + v_0\,u_2(w; w_0) + \cdots = 0, \tag{29}$$

where the coefficients of the powers of $v_0$ are analytic in $w$ and $w_0$.

But for $v_0 = 0$ this equation reduces to the equation $\delta u(w; w_0) = 0$. According to the well-known results of Sturm, this equation will have an infinity of simple roots greater than $w_0$ as well as an infinity of smaller roots. Indeed the very definition of the transformation $T$ along the boundaries depends on this fact (see I, section 13), since the point $(0, 0, w)$ of the $w$ axis is transformed precisely into the point $(0, 0, w_1(w))$ of this axis, where $w_1$ denotes the second root of $\delta u = 0$ greater than $w$; the first point of meeting of


the neighboring trajectory and the plane $u = 0$ will occur for $w$ in the neighborhood of the first root, with $v < 0$. Indeed, at a root of $\delta u = 0$ the first of the equations (23) gives $\delta u' = f_v\,\delta v$, so that $\delta v$ must be positive at $w_0$, then negative at the first larger root of $\delta u = 0$, then positive at $w_1$, and so on.

Thus equation (29) for $v_0 = 0$ admits the analytic solution $w = w_1(w_0)$, where $w_1$ is analytic in $w_0$, while $w_1(w_0) - w_0$ is periodic of period $2\pi$ in $w_0$. But for $v_0 = 0$ the partial derivative of $\Phi(w_1; w_0, v_0)$ with respect to $w_1$ reduces precisely to $\delta u'(w_1; w_0)$, which is positive. Consequently the general solution will be of the form $w_1 = F(w_0, v_0)$, where $F(w_0, v_0)$ is analytic in $w_0, v_0$, with $F(w_0, v_0) - w_0$ periodic, of period $2\pi$, in $w_0$, and such that $F(w_0, 0)$ increases with $w_0$ ($F_{w_0}(w_0, 0) > 0$).

To obtain the corresponding value $v_1$ of $v$, it suffices to substitute this value of $w_1$ in $v$: $v_1 = v(F(w_0, v_0); v_0, w_0)$. Thus one obtains $v_1 = G(w_0, v_0)$, where $G$ is likewise analytic in $v_0$ and $w_0$, periodic, of period $2\pi$, in $w_0$, and vanishes for $v_0 = 0$.
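The argument is an application of the implicit function theorem: a simple root of the equation at $v_0 = 0$ continues analytically to small $v_0$. A minimal numerical sketch follows, using Newton's method on a toy function; `phi` below is a hypothetical stand-in (leading term $\sin(w - w_0)$, whose second root beyond $w_0$ is $w_0 + 2\pi$) and is not derived from the restricted problem.

```python
import math

def phi(w, w0, v0):
    # toy stand-in: leading term with simple roots w0 + k*pi, plus a small periodic correction
    return math.sin(w - w0) + v0 * 0.3 * math.cos(2.0 * w)

def dphi(w, w0, v0):
    # partial derivative of phi with respect to w
    return math.cos(w - w0) - v0 * 0.6 * math.sin(2.0 * w)

def F(w0, v0, iterations=30):
    """Continue the second root beyond w0 from v0 = 0 to the given v0 by Newton's method."""
    w = w0 + 2.0 * math.pi
    for _ in range(iterations):
        w -= phi(w, w0, v0) / dphi(w, w0, v0)
    return w

w0, v0 = 0.7, 0.05                      # hypothetical sample values
root_unperturbed = F(w0, 0.0)
root = F(w0, v0)
shifted = F(w0 + 2.0 * math.pi, v0)
print(root_unperturbed, root, shifted)
```

Because the derivative with respect to $w$ is positive at the unperturbed root, the iteration converges for small $v_0$, and $F(w_0, v_0) - w_0$ is unchanged when $w_0$ is increased by $2\pi$.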

Hence $T$ must be analytic along $L_1$ or $L_2$, which is what we wished to prove.

Evidently the same reasoning shows that the inverse transformation $T^{-1}$ is also analytic along these boundaries.

In these normal parameters $\rho, \theta$ the fundamental transformation $T$ of the surface of section $S_2$ into itself remains analytic even on the boundaries $L_1$ and $L_2$; that is, the functions $f$ and $g$ which enter into the equations (20) are analytic in $\rho, \theta, \mu$ for the values of $\rho$ corresponding to the boundaries $L_1$ and $L_2$ of $S_2$, at least for $\mu$ sufficiently small.

13. - On the form of $T$ for small $\mu$ or large $C$.

We now wish to consider more closely the form of the transformation $T$ for small $\mu$ or for large $C$.

Since for $\mu = 0$ we obtain the integrable case already discussed, we may write $T$ in the form

$$\bar\rho_1 = \bar\rho + \mu\,r(\bar\rho, \theta, \lambda, \mu), \qquad \theta_1 = \theta - 2\pi a^{3/2} + \mu\,s(\bar\rho, \theta, \lambda, \mu),$$

where we have written $\lambda = 1/\sqrt{C}$ and $\bar\rho = \rho/\lambda$, $\bar a = a/\lambda^2$. We introduce $\bar\rho$ in place of $\rho$ so that the dimensions of $L_1$ and $L_2$ do not tend to zero when $\lambda$ tends to zero and $C$ tends to infinity. The equations (18), (19) then give analogous equations in $\bar p, \bar q, \bar p', \bar q'$, where $p = \lambda\bar p$, $q = \lambda\bar q$, $\tau = \lambda\bar\tau$; and these differential equations and the corresponding Jacobi integral, whether in the variables $\bar p, \bar q, \bar\tau$ or in $\bar x, \bar y, \bar t$, show that for $\lambda = 0$ there is a limiting dynamical problem (see section 10 above), a limiting surface of section $\bar S_2$, and a limiting transformation $\bar T$.




Indeed the dynamical problem in the variables $\bar x, \bar y, \bar t$ reduces to a special integrable problem of two bodies, namely that of a point $(\bar x, \bar y)$ of infinitesimal mass, referred to fixed axes of $\bar x$ and $\bar y$, attracted by a point of mass $1 - \mu$ situated at the origin. Consequently the transformation $\bar T$ reduces here to the identity $\bar\rho_1 = \bar\rho$, $\theta_1 = \theta$, all the curves of motion being closed ellipses. Hence $r$ and $s$ must contain a factor $\lambda$. The circles of radius less than $(1 - \mu)$ are the limiting curves of motion $L_1^*$ and $L_2^*$, with the retrograde and direct circles of radius $(1 - \mu)$ as boundaries $L_1$ and $L_2$ respectively (see I, section 12).

It is well known that there is another problem, even more special than the restricted problem of three bodies, of which it is a limiting case, namely the problem treated by Hill:


d ' x o 

dO * dt 





In this problem, which is very important for the theory of the motion of the Moon, there exists a symmetry with respect to the $y$ axis as well as with respect to the $x$ axis. This case is obtained by neglecting the "parallax" of the Sun. The result which I am about to prove was suggested to me by this double symmetry of Hill's limiting case.

Let us remark that in the variables $\bar x, \bar y, \bar t$ one will have for the corresponding function $\bar\Omega$ the series expansion

$$\bar\Omega(\bar x, \bar y, \lambda, \mu) = \frac{1-\mu}{\sqrt{\bar x^2 + \bar y^2}} + \tfrac{3}{2}\lambda^2\mu + \tfrac{1}{2}\lambda^6\bigl[(1 + 2\mu)\bar x^2 + (1 - \mu)\bar y^2\bigr] + \lambda^8\mu\,\Omega_8 + \cdots. \tag{30}$$

Consequently there is a symmetry with respect to the $y$ axis which is valid up to the terms of the 8th degree in $\lambda$ and of the first degree in $\mu$ (14). More precisely, on replacing $\bar x$ and $\bar t$ by $-\bar x$ and $-\bar t$, the differential equations and the Jacobi integral obtained from (1) and (3) are changed only in terms containing a factor $\lambda^8\mu$.
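The order $\lambda^8\mu$ of the asymmetric part can be observed numerically. The sketch below rests on an assumption that is not taken from the text above: the force function of the restricted problem is written as $\Omega = \tfrac12[(x-\mu)^2 + y^2] + (1-\mu)/r_1 + \mu/r_2$, with the mass $1-\mu$ at the origin and the mass $\mu$ at $(1, 0)$, and the scaled function is $\bar\Omega = \lambda^2\,\Omega(\lambda^2\bar x, \lambda^2\bar y)$. The function names and sample values are hypothetical.

```python
import math

def omega_bar(xb, yb, lam, mu):
    # ASSUMPTION: mass 1 - mu at the origin, mass mu at (1, 0), centre of mass at (mu, 0)
    x, y = lam**2 * xb, lam**2 * yb
    r1 = math.hypot(x, y)
    r2 = math.hypot(x - 1.0, y)
    omega = 0.5 * ((x - mu) ** 2 + y * y) + (1.0 - mu) / r1 + mu / r2
    return lam**2 * omega

def asymmetry(xb, yb, lam, mu):
    """Part of omega_bar that changes sign under reflection in the y axis (doubled)."""
    return omega_bar(xb, yb, lam, mu) - omega_bar(-xb, yb, lam, mu)

xb, yb = 0.8, 0.3                         # hypothetical sample point
ratio_lambda = asymmetry(xb, yb, 0.4, 0.01) / asymmetry(xb, yb, 0.2, 0.01)
ratio_mu = asymmetry(xb, yb, 0.4, 0.02) / asymmetry(xb, yb, 0.4, 0.01)
print(ratio_lambda, ratio_mu)
```

Halving $\lambda$ divides the asymmetric part by nearly $2^8 = 256$, and doubling $\mu$ doubles it, which is the behavior expected of a term with factor $\lambda^8\mu$.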

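Suppose now that we write the solution of the differential equations in the form

$$\bar x = \Phi(\bar x_0, \bar y_0, \bar x_0', \bar y_0', \lambda, \mu, \bar t - \bar t_0), \qquad \bar y = \Psi(\bar x_0, \bar y_0, \bar x_0', \bar y_0', \lambda, \mu, \bar t - \bar t_0),$$

where $\bar x, \bar y, \bar x', \bar y'$ reduce to $\bar x_0, \bar y_0, \bar x_0', \bar y_0'$ respectively for $\bar t = \bar t_0$. The functions $\Phi$ and $\Psi$ will be analytic in the variables indicated (except for the case of collision). Consider the particular solution for which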
du choc). Considerons la solution particulifere pour laquelle 

V.-O, i.’-O, 0,^,/i) —1 


(14) Where $\lambda$ is still regarded as the parameter of $S_2$: $\lambda = 1/\sqrt{C^*}$.




at $\bar t_0 = 0$. This means that we consider the family of solutions each member of which corresponds to a curve of motion which at $\bar t = 0$ crosses the $\bar x$ axis at right angles, with an abscissa $\bar x_0$ and with a velocity satisfying the Jacobi integral. Evidently the curve obtained will be symmetric with respect to the $\bar x$ axis.

But for $\lambda = 0$ there is a circular curve of motion of radius $1 - \mu$ (direct or retrograde). Hence for $\mu$ and $\bar x_0 - 1$ small the curve of motion will cut the negative $\bar x$ axis near $\bar x = -1$ with a small component of velocity $V(\bar x_0, \lambda, \mu)$ in the direction of the positive $\bar x$ axis. But this function $V$ must contain a factor $\lambda^3$. Indeed if one suppresses the terms in $\lambda^3, \lambda^4, \ldots$ in the two differential equations, they become the equations of a point $(\bar x, \bar y)$ of infinitesimal mass attracted toward the origin, of mass $1 - \mu$, according to Newton's law in a fixed plane; hence the modified function $\bar V(\bar x_0, \lambda, \mu)$ thus obtained vanishes identically, and we must have

$$V(\bar x_0, \lambda, \mu) = \lambda^3\bigl(\alpha(\mu)(\bar x_0 - 1) + \beta(\mu)\lambda + \cdots\bigr),$$

where we shall have $\alpha \neq 0$ (see I, section 11). Consequently there is a single analytic continuation of the circular motion which cuts the $\bar x$ axis twice at right angles. Moreover this continuation must be symmetric with respect to the $\bar x$ axis and constitutes a closed curve of motion.

Mais si Ton omet les termes de & de la 8* puissance au moins en X et de 

la 1” puissance au moins en /x, cela ne changera V(x 0t X,/x) qu’en des puissances 

semblables; et la sgrie _ 

x 0 — 1 + S t (/x)X + S t (i*)X* + — 


qui donne x 0 sera la nteme qu’auparavant jusqu’aux termes de degr6 5 en X et 
de degr£ 1 en /i (un facteur en J 3 6tant Scarte). En substituant cette valeur 
raodifige pour x 0 dans les fonctions modifies 0 et If' on obtient une courbe 
syntetrique par rapport & l’axe des i et i l’axe des y. En effet dans le cas 
contraire, puisque les Equations et 1 ’integrate modiftees jouissent d’une double 
syntetrie on obtiendrait une autre courbe par reflexion par rapport & l’axe des y. 
Mais la continuation est unique. 

It follows that the family of auxiliary closed curves which defines $S_2$ in
the restricted problem of three bodies, as well as the mean anomaly $\theta$,
possess a symmetry with respect to the $y$-axis up to terms which contain a
factor $\lambda^5\mu$.

Choisissons maintenant la constante C. Les courbes p —const. de S t seront 
obtenues dans le plan des x, y en laissant X croitre de X — l/fC b l’infini; elles 
deviendront de plus en plus syntetriques par rapport & l’axe des y quand X tend 
vers z<5ro. Consid€rons deux points P et P qui sont syntetriques par rapport 
& O — n/2 done avec la nteme valeur de p et avec O + Q^n. Ces points sont done 
presque syntetriques par rapport £ l’axe des y jusqu’aux termes d’ordre X i /x; 




et les tangentes en ces points seront presque symgtriques jusqu’au mfime degr4. 
Par consequent les deux courbes de mouvement tangentes seront presque sym£- 
triques dans ce sens. 

It is now very simple to prove that the two transformed points
$T(\rho, \theta)$, $T^{-1}(\rho, \bar\theta)$ will have positions actually
symmetric up to the same degree; these are respectively the adjacent points of
tangency as the time increases and decreases from zero. Indeed, if an analytic
family is modified in $\lambda$, $\mu$ only by terms of order $\lambda^5\mu$ at
most, and at the same time a tangent curve with different curvature is modified
in the same manner, the position of the point where the given curve is tangent
to a curve of the family will vary analytically by this order of magnitude at
most, among the curves of the family.

Whence the conclusion concerning the form of $T$:

La transformation T est de la forme 


(31) 


£• — Q+iprfo o,x,p), 

(*®t 


o l —o—2*x*a*(e)+Xp3 t (e, 0 , x t p), 

2i‘p* + 2 Pq 


en les paramHres normaux g, 0 (g—g/X), ou r, et s t sont analytiques en les 
variables indiqutes et de ptriode 2* en Q pour p petit, et tclles que la 
transformation RT—U (R €tant une reflexion par rapport d 0 — 0) est 
involutive et que la transformation RT—U (R ttant une reflexion par 
rapport d Vaxe 0 — n/2) est involutive jusqu’aux termes en X'p. 


14. - Introduction of the canonical variables $\rho^*$, $\theta^*$.

Jusqu’ici nous n’avons pas utilise le fait qu’il existe une integrate inva¬ 
riants / f I(g, Q)dgd0. A cette integrate invariante correspond le volume inva- 

riant fff dpdqdy/ (voir la section 5 plus haut et aussi I, section 14). 

En effet Mdment de l’aire ordinaire de S, aura la forme M(g, 0)dgd0 ou 
la fonction M(g, 0) est analytique et positive partout. Pendant le temps dx cet 
616ment traversera un Stement cylindrique de volume M(g, 0)dgddv sin codx oil v 
est la vitesse et a> est Tangle fornte par la surface S t et la trajectoire qui passe 
par l’616ment donnd. fividemment nous devons avoir 

Met, 6i)Vi sin co.do.dO.dt — m(g, 6)M(g, 0)v sin (odgdOdx 

& cause de l’invariance de J m(Q)dQ (voir l’dquation (10)). Ceci nous donne la 
fonction quasi-invariante de 1’integrate invariante 

(32) I(g, 0) — m(g, 0)3/(p, 0)v sin a > 




et son Equation fonctionelle 

( 33 ) hq, e)-ne„ o,) 

Un calcul direct nous montre que pour p = 0 l’on a 

(34) He, 6) - /(e) - ~ QtSkffi -<?*)• 

We shall now introduce other "canonical" variables $\rho^*$, $\theta^*$ in the
following manner. Let us define the variable $\rho^*$ as equal to

$$\iint I(\rho, \theta)\, d\rho\, d\theta \Big/ J$$

where the domain of integration is an annular region and where $J$ represents
the complete integral. Hence we shall always have $0 \le \rho^* \le 1$.

Now let us take $\theta^*$ equal to
$2\pi \int_0^{\theta} I(\rho, \theta)\, d\theta / K(\rho)$, where $K$ represents
the complete integral obtained from the numerator on taking $\theta = 2\pi$. In
these variables the invariant integral is written quite simply
$\iint d\rho^*\, d\theta^*$. Moreover $\rho^*$, $\theta^*$ will evidently be
normal variables for $S_2$, at least except along the two boundaries. In the
neighborhood of these boundaries $L_1$ and $L_2$ we see that

$$\rho^* = a(\rho - \rho')^2 + \cdots, \qquad \rho^* = 1 + b(\rho'' - \rho)^2 + \cdots$$

respectively, so that the variables $\sqrt{\rho^*}$, $\theta^*$ and
$\sqrt{1 - \rho^*}$, $\theta^*$ will here be normal.

Hence the canonical variables $\rho^*$, $\theta^*$ defined by the equations

$$\rho^* = \frac{\int_{\rho'}^{\rho}\int_0^{2\pi} I(\rho, \theta)\, d\rho\, d\theta}{\int_{\rho'}^{\rho''}\int_0^{2\pi} I(\rho, \theta)\, d\rho\, d\theta}, \qquad \theta^* = 2\pi\, \frac{\int_0^{\theta} I(\rho, \theta)\, d\theta}{\int_0^{2\pi} I(\rho, \theta)\, d\theta} \tag{35}$$

make correspond to $S_2$ the domain $0 \le \rho^* \le 1$,
$0 \le \theta^* \le 2\pi$, with $\theta^*$ angular of period $2\pi$. The
variables $\rho^*$, $\theta^*$ are normal in the interior of $S_2$, while
$\sqrt{\rho^*}$, $\theta^*$ and $\sqrt{1 - \rho^*}$, $\theta^*$ are normal in
the neighborhood of the boundaries $L_1$ and $L_2$ respectively.

Les Equations de la transformation en ces variables s’ecrivent 


t! 


7(p, 6)dgdO/J 


) 0 l *-0--2.iA 3 a* 3 (p*)+x / is*(p-,0*,A,/i), (a*(p*) = a(p/7)) 

' 9 I 9r-Q' + ^r-(g\e-,Ap) 

In these variables we shall still have the property of symmetry with respect to
$\theta^* = 0$, symbolized by the formula $T = RU$ ($R$ being a reflection with
respect to the axis $\theta^* = 0$), and the property of approximate symmetry
with respect to $\theta^* = \pi/2$, symbolized by $T \approx \bar R \bar U$,
valid up to terms of order $\lambda^5\mu$ at least. Moreover the invariant
integral reduces to $\iint d\rho^*\, d\theta^*$, hence we shall have the
identity

$$\frac{\partial \rho_1^*}{\partial \rho^*}\frac{\partial \theta_1^*}{\partial \theta^*} - \frac{\partial \rho_1^*}{\partial \theta^*}\frac{\partial \theta_1^*}{\partial \rho^*} = 1.$$


15. - Another form of $T$.

There are formal advantages which result from the use, in theoretical Dynamics,
of contact transformations, in associating half of the given dependent
variables with the transformed conjugate variables. This fact has been
recognized since Hamilton and Jacobi. For our problem this means in particular
that one should associate $\rho^*$ and $\theta_1^*$, and at the same time
$\rho_1^*$ and $\theta^*$. By the use of such an association Poincaré (loc.
cit., t. 3) reduced the problem of closed trajectories to a problem of critical
points, at least for $\mu$ "infinitesimal" and in the neighborhood of a given
periodic motion.

The use of our canonical variables $\rho^*$, $\theta^*$ allows us to go further
in this direction, as we shall see.

Let us consider the integral

$$\oint (\rho_1^*\, d\theta_1^* - \rho^*\, d\theta^*)$$

taken along a small closed curve of $S_2$. Because of the invariance of areas,
this integral must always vanish; hence we shall have identically

$$\rho_1^*\, d\theta_1^* - \rho^*\, d\theta^* = d\bar F(\rho^*, \theta^*) \tag{37}$$

where the function $\bar F(\rho^*, \theta^*)$ is analytic in the interior of
$S_2$ and continuous everywhere. Moreover, if $\bar F$ is continued once around
the annular surface $S_2$, the new determination satisfies the same identity,
so that its differential agrees with $d\bar F$. Consequently the two
determinations differ by a constant, and their difference is given by the
integral $\int (\rho_1^*\, d\theta_1^* - \rho^*\, d\theta^*)$ taken around
$S_2$. But along $L_1$ we have $\rho_1^* = \rho^* = 0$, hence the integral must
reduce to zero.

On the other hand, if $(\rho_1^*, \theta_1^*) = T(\rho^*, \theta^*)$, the
property $T = RU$ gives $(\rho^*, -\theta^*) = T(\rho_1^*, -\theta_1^*)$.
Consequently, according to the identity (37) above,

$$-\rho^*\, d\theta^* + \rho_1^*\, d\theta_1^* = d\bar F(\rho_1^*, -\theta_1^*),$$

whence it follows that $\bar F(\rho_1^*, -\theta_1^*)$ differs from
$\bar F(\rho^*, \theta^*)$ only by a constant. But when $(\rho^*, \theta^*)$
describes $L_1$ in the positive sense, $(\rho_1^*, -\theta_1^*)$ describes it in
the negative sense, and the two points must cross at least once. Hence these
two functions are identically equal. We conclude that the following functional
equation holds:

$$\bar F(\rho_1^*, -\theta_1^*) = \bar F(\rho^*, \theta^*). \tag{38}$$



But for $\mu$ sufficiently small ($\mu \le \mu_1$) we can regard $\rho^*$ and
$\theta_1^*$ as normal variables in the interior of $S_2$ just as well as
$\rho^*$ and $\theta^*$, since we have
$\partial\theta_1^*/\partial\theta^* > 0$ for $\mu$ small. Hence we have

$$\rho_1^*\, d\theta_1^* - \rho^*\, d\theta^* = d\bar F(\rho^*, \theta_1^*)$$

where $\bar F$ is analytic in $\rho^*$ and $\theta_1^*$, and periodic in
$\theta_1^*$. Consequently we have also

$$\rho_1^*\, d\theta_1^* + \theta^*\, d\rho^* = d(\rho^* \theta_1^*) + dF(\rho^*, \theta_1^*) \tag{39}$$

where $F = \bar F + \rho^*(\theta^* - \theta_1^*)$. But when the point
$(\rho^*, \theta_1^*)$ makes the circuit of $S_2$, the variables $\rho^*$ and
$\theta^* - \theta_1^*$ return to their initial values, hence
$F(\rho^*, \theta_1^*)$ must be analytic in $\rho^*$, $\theta_1^*$ and periodic
in $\theta_1^*$.

Using once more the symmetry already employed we find

$$-\rho^*\, d\theta^* - \theta_1^*\, d\rho_1^* = d(-\rho_1^* \theta^*) + dF(\rho_1^*, -\theta^*)$$

whence, by subtraction, we obtain equation (41) below.

Thus, on comparing the two sides of (39), we obtain the following result:

For $\mu \le \mu_1$ sufficiently small the transformation $T$ can be written in
the canonical variables $\rho^*$, $\theta^*$ in the following manner:

$$\rho_1^* = \rho^* + F_{\theta_1^*}(\rho^*, \theta_1^*), \qquad \theta^* = \theta_1^* + F_{\rho^*}(\rho^*, \theta_1^*) \tag{40}$$

where $F$ is analytic in $\rho^*$ and $\theta_1^*$ and periodic of period
$2\pi$ in $\theta_1^*$, except on the boundaries $L_1$ and $L_2$, where one has

Fmc+e'F.tff, a,*) et 0,*) 

respectively, with $F_1$ and $F_2$ analytic in the indicated variables.
Moreover $F$ satisfies the equation of symmetry

$$F(\rho^*, \theta_1^*) - F(\rho_1^*, -\theta^*) = (\rho^* - \rho_1^*)(\theta^* - \theta_1^*) \tag{41}$$

as well as the equation of approximate symmetry

$$F(\rho^*, \theta_1^*) - F(\rho_1^*, \pi - \theta^*) \approx (\rho^* - \rho_1^*)(\theta^* - \theta_1^*), \tag{41'}$$

valid up to terms of order $\lambda^5\mu$ at least in $\lambda$ and $\mu$. (The
variables $\mu$ and $\lambda = 1/\sqrt{C}$ are suppressed here.)
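The passage from the differential identity (39) to the explicit form (40) is a
short computation with differentials, which the following sketch spells out.
Writing $\bar F$ for the function of (37) expressed in the variables
$(\rho^*, \theta_1^*)$ is a notational assumption made here for the
illustration; only the relation $F = \bar F + \rho^*(\theta^* - \theta_1^*)$
and the area identity are used.

```latex
% Sketch only: \bar F is assumed to satisfy
%   \rho_1^* d\theta_1^* - \rho^* d\theta^* = d\bar F .
\begin{align*}
F  &= \bar F + \rho^*(\theta^* - \theta_1^*), \\
dF &= (\rho_1^*\,d\theta_1^* - \rho^*\,d\theta^*)
      + (\theta^* - \theta_1^*)\,d\rho^*
      + \rho^*\,(d\theta^* - d\theta_1^*) \\
   &= (\rho_1^* - \rho^*)\,d\theta_1^* + (\theta^* - \theta_1^*)\,d\rho^* .
\end{align*}
% Comparing the coefficients of d\theta_1^* and d\rho^* gives
\begin{align*}
\frac{\partial F}{\partial \theta_1^*} = \rho_1^* - \rho^*, \qquad
\frac{\partial F}{\partial \rho^*}     = \theta^* - \theta_1^* .
\end{align*}
```

These two relations are the two equations of the transformation in the mixed
variables $(\rho^*, \theta_1^*)$.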

Let us now remark that the transformation $T^k$ for any $k$ does not change the
areas $\iint d\rho^*\, d\theta^*$, and that $\rho^*$ and $\theta_k^*$ can be
employed as normal variables of $S_2$ for $\mu \le \mu_k$ (sufficiently small).
Let us remark also that $RT^k = U^{(k)}$ is involutive. In this manner we
conclude more generally:

For $\mu \le \mu_k$ sufficiently small the transformation $T^k$
($k = \pm 1, \pm 2, \dots$) can be written

$$\rho_k^* = \rho^* + F^{(k)}_{\theta_k^*}(\rho^*, \theta_k^*), \qquad \theta^* = \theta_k^* + F^{(k)}_{\rho^*}(\rho^*, \theta_k^*) \tag{42}$$

where the function $F^{(k)}(\rho^*, \theta_k^*)$ is of the same analytic
character as $F(\rho^*, \theta_1^*)$, with the explicit expression

$$\begin{aligned} F^{(k)}(\rho^*, \theta_k^*) \equiv{} & F(\rho^*, \theta_1^*) + F(\rho_1^*, \theta_2^*) + \cdots + F(\rho_{k-1}^*, \theta_k^*) \\ & + \rho^*(\theta_1^* - \theta_k^*) + \rho_1^*(\theta_2^* - \theta_1^*) + \cdots + \rho_{k-1}^*(\theta_k^* - \theta_{k-1}^*). \end{aligned} \tag{43}$$

Moreover $T^k = RU^{(k)}$ where $U^{(k)}$ is involutive, and also
$T^k \approx \bar R \bar U^{(k)}$ where $\bar U^{(k)}$ is involutive up to
terms of order $\lambda^5\mu$ at least in $\lambda$ and $\mu$. Consequently we
have the equation of symmetry

$$F^{(k)}(\rho^*, \theta_k^*) - F^{(k)}(\rho_k^*, -\theta^*) = (\rho^* - \rho_k^*)(\theta^* - \theta_k^*) \tag{44}$$

as well as the equation of approximate symmetry

$$F^{(k)}(\rho^*, \theta_k^*) - F^{(k)}(\rho_k^*, \pi - \theta^*) \approx (\rho^* - \rho_k^*)(\theta^* - \theta_k^*), \tag{44'}$$

valid up to terms of order $\lambda^5\mu$ at least in $\lambda$ and $\mu$.

16. - The invariant points of $T^k$ ($k = 1, 2, \dots$).

The results of the preceding section allow us to obtain an elegant form for the
fixed points of $T^k$ for $\mu \le \mu_k$. These points correspond to the closed
trajectories which cross $S_2$ precisely $k$ times while turning $l$ times
around $L_1$ and $L_2$. Since in this case $\rho_k^* = \rho^*$,
$\theta_k^* = \theta^* + 2l\pi$, we obtain from equations (42)

$$F^{(k)}_{\theta_k^*}(\rho^*, \theta_k^*) = 0, \qquad F^{(k)}_{\rho^*}(\rho^*, \theta_k^*) + 2l\pi = 0.$$

Conversely, when these equations hold at a point $(\rho^*, \theta_k^*)$, the
same equations show that $\rho_k^* = \rho^*$, $\theta_k^* = \theta^* + 2l\pi$.
Hence we conclude:

The fixed points of $T^k$ for which $\theta$ increases by $2l\pi$ are (for
$\mu \le \mu_k$) precisely the critical points of the function

$$F^{(k,l)}(\rho^*, \theta_k^*) \equiv F^{(k)}(\rho^*, \theta_k^*) + 2l\pi\rho^*. \tag{45}$$

In the case where this function $F^{(k,l)}$ possesses no critical point, the
lines $F^{(k,l)}(\rho^*, \theta_k^*) = F_{kl}$ cannot cross one another.
Moreover, $L_1$ and $L_2$ are two closed lines of this kind. When the constant
$F_{kl}$ on the right increases from the least value of $F^{(k,l)}$ (say along
$L_1$) to the greatest (along $L_2$), these curves must spread out from $L_1$
to $L_2$. On the other hand, if these level lines cross, there are at least two
critical points of this function.
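The correspondence between fixed points and critical points can be checked on a
toy example. The sketch below does not use the function $F$ of the memoir,
which is not available in closed form here. It assumes instead a hypothetical
generating function $F(\rho, \theta_1) = -C\rho^2/2 - \varepsilon\cos\theta_1$
(the names `C` and `EPS` and their values are illustrative assumptions), for
which the two relations $\rho_1 = \rho + F_{\theta_1}$,
$\theta = \theta_1 + F_{\rho}$ can be solved explicitly, and it verifies that
the critical points of $F + 2l\pi\rho$ are carried to $(\rho, \theta + 2l\pi)$.

```python
import math

C = 4.0 * math.pi   # hypothetical twist strength (assumption, not from the text)
EPS = 0.3           # hypothetical perturbation size (assumption)


def F_rho(rho, theta1):
    """Partial derivative in rho of F(rho, theta1) = -C*rho**2/2 - EPS*cos(theta1)."""
    return -C * rho


def F_theta1(rho, theta1):
    """Partial derivative in theta1 of the same F."""
    return EPS * math.sin(theta1)


def T(rho, theta):
    """Map defined by rho1 = rho + F_theta1(rho, theta1), theta = theta1 + F_rho(rho, theta1)."""
    theta1 = theta - F_rho(rho, theta)   # explicit because F_rho ignores theta1 here
    rho1 = rho + F_theta1(rho, theta1)
    return rho1, theta1


def critical_points(l):
    """Critical points (rho, theta1 mod 2*pi) of F + 2*pi*l*rho for this toy F."""
    rho = 2.0 * math.pi * l / C          # from -C*rho + 2*pi*l = 0
    return [(rho, 0.0), (rho, math.pi)]  # from EPS*sin(theta1) = 0


def hessian_det(theta1):
    """Determinant of the (diagonal) Hessian of F + 2*pi*l*rho."""
    return (-C) * (EPS * math.cos(theta1))


def check(l):
    """For each critical point, test the fixed-point property and classify it."""
    results = []
    for rho, theta in critical_points(l):
        rho1, theta1 = T(rho, theta)
        is_fixed = (abs(rho1 - rho) < 1e-12
                    and abs(theta1 - (theta + 2.0 * math.pi * l)) < 1e-12)
        kind = "saddle" if hessian_det(theta1) < 0 else "extremum"
        results.append((rho, theta, is_fixed, kind))
    return results


if __name__ == "__main__":
    for row in check(1):
        print(row)
```

In this toy case the two critical points are of different kinds, one saddle
and one extremum, which is in keeping with the statement that at least two
critical points appear together.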



Hence it suffices for the existence of such a critical point of the function
$F^{(k,l)}(\rho^*, \theta_k^*)$ defined by (45) that its partial derivative with
respect to $\rho^*$ not possess the same sign (positive or negative) along
$L_1$ and $L_2$, which means that $\theta_k^* - \theta^* - 2l\pi$ does not have
the same sign everywhere on the boundaries. In this case there is at least a
second critical point.

It is interesting to remark that, according to Poincaré's last geometric
theorem, if $\theta_k^* - \theta^* - 2l\pi$ has opposite signs along $L_1$ and
along $L_2$ (without $\mu$ being small) there will exist at least two fixed
points of $T^k$ of this kind.

Our criterion above has the disadvantage that it does not apply to the fixed
points of $T^k$ ($k = 1, 2, \dots$) except for $\mu \le \mu_k$, so that the
interval of $\mu$ becomes smaller and smaller as $k$ increases. We can avoid
this difficulty in part in the following manner.

Supposons que (g*, 0*) soit un point fixe de T k de fagon que 

<r)-(e', o-+ 2 b,). 

Done pour i, nous aurons 


<46) 


+£,•(**, on, e'-e,'+F t .( e , en, 
e‘—e<' + F»Ae,’, en, e t '-o t -+F„.( e n en. 


e'-el-.+FAel-,, e'), +2 bt+F^ftU. O'). 


On voit ainsi directement que si I’on icrit 


<47) ■ F(g\ Or) + ... + F(g^ lt 0‘) + 2lngl_ % 

tout ensemble de points fixes de T tel que 

(e', O'), (en en- Tie', o'),..... < e -, e-+ 2U)-T( e ;_, , or.,), 


est fourni pour p^p x par les points critiques de la fonction F ik) de (47) 
oil les variables £*,.•••» pj_*, 0*,...., 0J_, sont r€gard€es comme 2k variables 
ind,6pendantes. 

In what follows we shall often have to consider the integers $k$ and $l$ of a
periodic point $P = (\rho, \theta)$ of $S_2$, that is to say the least positive
integer $k$ such that

$$T^k(\rho^*, \theta^*) = (\rho^*, \theta^* + 2l\pi),$$

$l$ being an integer. We shall call them the "characteristic integers" of the
corresponding periodic trajectory.




17. - Analytic continuation of the periodic motions from $\mu = 0$.

We now wish to employ these powerful analytic instruments to study the
interesting question, not yet resolved, of the totality of periodic motions
which give genuine analytic continuations of certain periodic motions for
$\mu = 0$. Hitherto it has always been believed that there exist no such
motions other than the motions symmetric with respect to the $x$-axis. We shall
obtain a criterion which will make it possible to answer this question
definitively as soon as the coefficient of $\mu$ in the development of
$F(\rho^*, \theta^*)$ in a series of powers of $\mu$ has been explicitly
calculated. For reasons which I shall explain, it seems to me very probable
that such motions, periodic and not symmetric with respect to the $x$-axis, but
nearly symmetric with respect to the $y$-axis, exist for $\lambda$ small
enough.

ficrivons les Equations qui d£finissent T de la mani&re suivante: 
i e.'-e - +w.(e', e-, X)+ M ' 9 ,( e % o\ x) + . 


Si nous exprimons et 6* en fonctions de g* et 0,* nous obtenons les series 

( 48 ') I + 0,*+2*i»a*V, ■»)) + — 

( e'-0,' + 27,X>a‘\ e %X)- MV ,(e\0r + 2xX>a‘\ e ',X)) + .... 

Rappelons maintenant les Equations (40). Nous en concluons immgdiatement 
qu’il existe une fonction F x dgfinie par le dgveloppement en s<5rie 

% F ^ 

F{ e ’, or, x, M )=2ni> I a“( e \ x)d e -+ M Fr<e’, or, x)+.... 

telle que 

j *>.<**, o l -+2»x>a- i ( e ’, x))=FMe\ er, 

1 Vt(Qi‘, 0‘ + 2.iX , a‘ t (e‘, X)) = -F le .(e’, or, X).... 

La condition precise pour que 1’on ait p t “ — g*, 0^ — 6* + 2ln est que la 
fonction F+2lng m ait un point critique en ( g u , 0 m ). Cela donne deux conditions 


if o', *) + /*£*.(?, e\ X)+„.. =0 

\ 2nX*a*'(g\ A) + 2bx + pF if -{g% 6% X) + .... - 0. 



Done il peut exister une continuation analytique de /x—0 au point (p*, 0*) si la 
valeur de g* satisfait k l’gquation ind£pendante de 0* 

(oil a*<“‘> est la fonction inverse de a*) et si, pour cette g # , 0* est une racine 

t Vf 

de l’lquation F t p(Q*, 0*, -l) — 0. II suffit pour l’existence d’une telle continuation 
unique que cette racine soit simple. Par consequent nous n’avons qu’ft examiner 
les racines de cette equation pour trouver de tels points fixes de T. 

Considerons d’une fa^on analogue les points fixes de la transformation 
iteree T k (£—1,2,..-). Ici il faut rempla$er a** par Ara #f . Nous trouvons que 
les conditions necessaires pour que g*, 0 • puisse donner lieu & une telle conti¬ 
nuation analytique, sont les suivantes: 

(a) (6) fifte-.P.i)- 0. 

En mfime temps une telle continuation unique existera si /’la-tr + O, e’est-ft-dire 

•x* 

si 0 * est une racine simple de liquation (6) en 0*. Ici Ff* designe le premier 

terme du developpement de F^ en s4rie de puissances de p. 

Mais selon la definition (43) nous avons explicitement 

Fr-F,(e\ 0-2 »l»a*V. ■»). *) + F,ie’, 0*- Wa*'( e *. i), ■») + •••• + 

+ F,(e% 0 *-2bdvV.i),a). 

Cela nous permet de donner les conditions de dessus sous une forme plus explicite. 

Pour que (g # , 0*) soit un point fixe de T k avec 0* # —0* + 2&r, qui permette 
une continuation analytique de p— 0, il faut que 

( (a) p --«•«->((- 

(49) | ( b) F,r(e',e'+^,x)+F l r(e',o'+*-£,x)+~..+ 

\ +F l(r {e‘, e-+ 2 fa, i)-o, 

oil Fi(g* t 0*, A) disigne le coefficient de p dans la s&rie qui donne F(g* t 0*, A, p) 
en puissances de p. Il suffit pour Vezistence d’une continuation unique 
que la fonction a la gauche dans liquation (49), (6) ait une dirivie non 
nulle par rapport d 0* pour cette meme valeur de 0*. 

Let us remark that if $\theta^*$ is a root of (b), $\theta^* + 2i\pi/k$ will be
one also for $i = 1, 2, \dots, k - 1$. Hence the values of $\theta^*$ are
distributed uniformly in an angular sense.

Jusqu’ici nous n’avons pas employt la symttrie de F { qui rtsulte par compa- 
raison de (41), a savoir 

(4i') £<e’. -o*. e , -2ni‘a‘ l (e’, i). i)- 

Cette Equation nous montre que la fonction F t jouit d’une esptce de symttrie 
angulaire par rapport a la courbe radiate 

0* + ni 3 a* , (p*, *)-0, 

qui nous permet de la dtvelopper en une strie de cosinus de la forme suivante 


-?,(?•, X)^F t „(<?•, X) + F lt (g\ X) cos (0* + nX>a*\g, *) + 

+ Fit(Q m , X) cos 2(0* + 7rPa #, (p*, *)) + .... 

De plus, pour une valeur de g • qui satisfait a (a), cette condition prend la forme 

£<e*. -«*. *> - F> (e*. o* + *£) 

et le point de symttrie est donnt par O'--In/k. Par consequent la fonction de 0* 
de ptriode 2n/k a la gauche dans (b) doit ttre une fonction impaire de 0*. 

Done pour toute valeur de g • qui satisfait a (49), (a), et qui se trouve 
entre g *' et p*", k et l n'ayant aucun facteur commun, il y a ntcessairement au 
moins 2k valeurs de 0* qui satisfont a ( b ); ils torment deux stries, 


gm n V* 2{k-l)U 

V -U, - - - 


et 



(2 k-\)U 
k » 


uniformtment distributes en sens angulaire. 

But it is well known that there exist two corresponding simple symmetric
periodic motions which can be continued from $\mu = 0$, and that every
symmetric periodic motion will have only one representative in the
neighborhood of these same values of $\theta$.

Hence the application of equations (49) gives immediately the well-known
symmetric periodic motions. It shows also that if the function on the left in
(49), (b) vanishes for other values of $\theta^*$, there will correspond to
each of the two series of such roots a unique analytic continuation (at least
if these roots are not double), and also that the two corresponding periodic
motions must be non-symmetric with respect to the $x$-axis.

Let us follow a little further the possibility of the existence of such values
of $\theta^*$.


Si l’on developpe la fonction F} k> en une s£rie analogue dc cosinus, on obtient 

iy‘V. 0', i) + .?,*<<»•. i) cos *(«•—*)) + 

+ F,«(e*. i) cos 2 t(0--.TVa’‘(g’, »)+-.]. 
Liquation (49), (5) s’ecrit ainsi dans la forme analogue: 

F lk (g *, X) sin (W 4- Li) + 2 F t k(e\ *) sin 2*(0 # + *0 + 3/U sin 3A:(0‘ + hi) +.... - 0, 

oii les racines qui nous intercssent sont celles pour lesquelles sin k0* 4= 0. Rcmar- 
quons en particulier que 0*-^/2 n'est pas une racine de sin AO* — 0 pour k impair: 

Mais selon nos resultats de plus haut la fonction F{* } doit etre a peu pr&s 
symetrique par rapport d 0* — ^ +n a); et par consequent la fonction & 
gauche dans notre dernifcre equation doit etre & peu pr£s impaire par rapport 

& 0 B — Plus precisement, les coefficients F t k, doivent avoir des develop- 

pements en puissances de X qui commencent avec des termes du cinquifcme ordre, 
au moins, en X. D’autre part il semble bien probable que les autres coeffi¬ 
cients F 9 k, F 4kt .... contiendront en general des termes du premier ordre en X. 

Moreover, the function $F_1(\rho^*, \theta^*, \lambda)$ is periodic and
analytic in $\theta^*$. Consequently the coefficients
$F_{10}, F_{11}, \dots, F_{1l}, \dots$ of its cosine series must decrease
rapidly, in such a way that

$$|F_{1l}| < M\sigma^l, \qquad |\sigma| < 1.$$

The asymptotic dependence of $F_{1l}$ on the index $l$ will depend on the
distribution and the character of the singularities of the function
$F_1(\rho^*, \theta^*, \lambda)$.

Hence, if the coefficients of the development of $F_{1l}$ in powers of
$\lambda$ decrease geometrically as $l$ increases, it appears very probable, in
virtue of the equation above, that there must exist other real roots near
$\theta^* = \pi/2$ for $k$ odd.
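The geometric decay of the cosine coefficients of a periodic analytic function
can be observed numerically. The sketch below is only an illustration: it uses
the Poisson kernel, a standard analytic $2\pi$-periodic function whose cosine
coefficients are known to be $2\sigma^n$ for $n \ge 1$, in place of the
function $F_1$ of the text, which is not computed here. The name `SIGMA` and
its value are assumptions.

```python
import math

SIGMA = 0.5  # hypothetical decay ratio (assumption)


def poisson_kernel(t):
    """Analytic 2*pi-periodic function equal to 1 + 2*sum_{n>=1} SIGMA**n * cos(n*t)."""
    return (1.0 - SIGMA ** 2) / (1.0 - 2.0 * SIGMA * math.cos(t) + SIGMA ** 2)


def cosine_coefficients(f, n_max, samples=512):
    """Approximate a_n = (1/pi) * integral over [0, 2*pi] of f(t)*cos(n*t) dt (trapezoid rule)."""
    h = 2.0 * math.pi / samples
    values = [f(j * h) for j in range(samples)]
    coeffs = []
    for n in range(n_max + 1):
        s = sum(v * math.cos(n * j * h) for j, v in enumerate(values))
        coeffs.append(s * h / math.pi)
    return coeffs


def decay_ratios(n_max=10):
    """Ratios a_{n+1} / a_n for n = 1, ..., n_max - 1."""
    a = cosine_coefficients(poisson_kernel, n_max)
    return [a[n + 1] / a[n] for n in range(1, n_max)]


if __name__ == "__main__":
    print(decay_ratios())
```

The trapezoid rule is used because, for a smooth periodic integrand sampled
over a full period, its error is far below the size of the coefficients being
measured.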

Nevertheless, to resolve definitively the question of an analytic extension of
the non-symmetric periodic motions of this kind, it seems to be necessary to
study in a more detailed fashion the function $F_1(\rho^*, \theta^*, \lambda)$
and in particular to determine the coefficients of its development in powers of
$\lambda$.

It is easy to construct in the $x, y$ plane the limiting figure of such a
non-symmetric periodic family. Indeed, for $\mu = 0$ the general motion is that
of a point which moves in a certain ellipse with a focus at the point
$x = y = 0$ according to the law of areas, while the ellipse turns with an
angular velocity $-1$ with respect to the axes. In the case which we are
considering, the initial axis lies very near the $y$-axis and the ellipse makes
$l$ complete revolutions during one period while the point circulates $k$ times
($k$ odd) around the ellipse. Moreover, since $\lambda$ is very small and $k$
is large and of order $\lambda^{-3}$ at least, the ellipse will be very small
and will be traversed many times during one period. Evidently such a limiting
periodic figure is not symmetric with respect to the $x$-axis.
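The limiting figure just described can be traced numerically. The following
sketch is an illustration under stated assumptions: unit attracting mass at
the focus (so that the mean motion $n$ and semi-major axis $a$ satisfy
$n^2 a^3 = 1$), a hypothetical eccentricity `e`, and an initial major axis
placed exactly along the $y$-axis. It only checks that the curve closes after
the ellipse has turned $l$ times while the point has gone $k$ times around it.

```python
import math


def solve_kepler(mean_anomaly, e, tol=1e-14):
    """Solve E - e*sin(E) = mean_anomaly for E by Newton's method."""
    E = mean_anomaly
    for _ in range(50):
        step = (E - e * math.sin(E) - mean_anomaly) / (1.0 - e * math.cos(E))
        E -= step
        if abs(step) < tol:
            break
    return E


def rotating_position(t, k, l, e, axis_angle=math.pi / 2):
    """Position at time t in axes relative to which the ellipse turns at rate -1."""
    n = k / l                       # k circuits of the ellipse while it turns l times
    a = n ** (-2.0 / 3.0)           # Kepler's third law with unit mass (assumption)
    E = solve_kepler(n * t, e)
    x = a * (math.cos(E) - e)       # coordinates in the ellipse's own axes, focus at origin
    y = a * math.sqrt(1.0 - e * e) * math.sin(E)
    phi = axis_angle - t            # direction of the major axis at time t
    return (x * math.cos(phi) - y * math.sin(phi),
            x * math.sin(phi) + y * math.cos(phi))


def closure_gap(k, l, e):
    """Distance between the positions at t = 0 and at t = 2*pi*l (one full period)."""
    x0, y0 = rotating_position(0.0, k, l, e)
    x1, y1 = rotating_position(2.0 * math.pi * l, k, l, e)
    return math.hypot(x1 - x0, y1 - y0)


if __name__ == "__main__":
    print(closure_gap(5, 1, 0.3))
```

Sampling `rotating_position` over $0 \le t \le 2\pi l$ gives the points of the
closed figure for plotting.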





Reprinted from the Proc. Nat. Acad. Sci., February, 1935, Vol. 21,
No. 2, pp. 96-99.


GENERALIZED MINIMAX PRINCIPLE IN THE CALCULUS OF 

VARIATIONS 

By G. D. Birkhoff and M. R. Hestenes 
Department of Mathematics, Harvard University
Communicated January 12, 1935

1. Introduction.—A fundamental problem in the Calculus of Variations
is that of establishing the existence of extremals satisfying given
initial conditions, and of classifying them. The first general existence
theorem involving simple integrals was that of Hilbert. He was concerned
with the existence of absolute minima, as have been Tonelli and
others following him. In the enunciation of his minimax principle
Birkhoff¹ established the existence of extremals of higher type. The
subsequent fundamental work of Morse² has developed systematically by the
use of methods of topology the existence and classification of still more
complicated types of extremals. It is part of the aim of the present paper
to show that all these further extremals can be obtained by the aid of a
natural and simple extension of the minimax principle.³

Our abstract formulation of this generalized minimax principle hinges
on a suitable definition of extremals of “type $k$” for a functional $J(P)$
defined on a function space $\Omega$. The definition is a natural one and is
constructive in character. At least in many non-singular cases it can be
shown to be equivalent to the usual definition. We give a method of
counting such extremals of various types. These counts satisfy the
inequalities of Birkhoff and Morse. We have essentially reduced the
question of existence of extremals of the various types to that of finding a
class of deformations $D_k$, the existence of which in many important cases
is intuitively obvious.

2. Classification of Cycles and Chains.—Consider a space $\Omega$ in which
the ordinary concepts of topology, such as $k$-cycles, $k$-chains, non-bounding
$k$-cycles, addition (modulo 2), etc., are well defined. The number $R_k$ of
independent non-bounding $k$-cycles in a maximal set of such cycles is called
the $k$-th connectivity of $\Omega$. We admit the possibility of $R_k$ being
infinite.

Let $\mathfrak{N}_k$ be a class of $k$-cycles which is closed under addition
and which contains a set of $R_k$ non-bounding $k$-cycles which are independent
on $\Omega$. Let $\mathfrak{M}_0 = \mathfrak{N}_0$ and $\mathfrak{M}_k$
($k > 0$) be a class of $k$-chains which is closed under addition and has the
following properties. The class $\mathfrak{M}_k$ contains $\mathfrak{N}_k$ as
a sub-class and every $k$-cycle in $\mathfrak{M}_k$ is in $\mathfrak{N}_k$. The
bounding $(k-1)$-cycle of a chain in $\mathfrak{M}_k$ belongs to
$\mathfrak{N}_{k-1}$ and every bounding $(k-1)$-cycle in $\mathfrak{N}_{k-1}$
bounds a $k$-chain in $\mathfrak{M}_k$.

If we denote by $M_k$, $N_k$ the number of independent $k$-chains in a maximal
set of such chains on $\mathfrak{M}_k$, $\mathfrak{N}_k$, respectively, then
the following (not independent) relations are immediate if the numbers $M_k$
($k = 0, 1, \dots, m$) are finite:

$$M_k \ge N_k \ge R_k \ge 0 \qquad (k = 0, 1, \dots, m) \tag{1}$$

$$M_0 = N_0, \qquad M_j - N_j = N_{j-1} - R_{j-1} \qquad (j = 1, \dots, m) \tag{2}$$

$$M_k - M_{k-1} + \cdots + (-1)^k M_0 = N_k - R_{k-1} + R_{k-2} - \cdots + (-1)^k R_0 \tag{3}$$

$$M_k - M_{k-1} + \cdots + (-1)^k M_0 \ge R_k - R_{k-1} + R_{k-2} - \cdots + (-1)^k R_0, \tag{4}$$

the equality in the last expression holding if and only if $N_k = R_k$. If
$M_{k+1} = 0$ then $N_k = R_k$ by (1) and (2). It is clear that if $R_k$ is
infinite so also are the numbers $M_k$ and $N_k$.
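Relations (1) to (4) are easy to test mechanically for finite counts. The
following is a minimal sketch, with the function name `check_relations` and
the sample counts chosen here purely for illustration; it takes the three
lists $M_k$, $N_k$, $R_k$ for $k = 0, \dots, m$ and reports whether all four
relations hold.

```python
def check_relations(M, N, R):
    """Return True if finite counts M_k, N_k, R_k (k = 0..m) satisfy relations (1)-(4)."""
    m = len(M) - 1
    if not (len(N) == len(R) == m + 1):
        raise ValueError("M, N and R must have the same length")
    # (1)  M_k >= N_k >= R_k >= 0
    if not all(M[k] >= N[k] >= R[k] >= 0 for k in range(m + 1)):
        return False
    # (2)  M_0 = N_0  and  M_j - N_j = N_{j-1} - R_{j-1}
    if M[0] != N[0]:
        return False
    if not all(M[j] - N[j] == N[j - 1] - R[j - 1] for j in range(1, m + 1)):
        return False
    for k in range(m + 1):
        alt_M = sum((-1) ** (k - i) * M[i] for i in range(k + 1))
        alt_R = sum((-1) ** (k - i) * R[i] for i in range(k + 1))
        # (3)  alternating sum of the M's equals N_k - R_{k-1} + ... + (-1)^k R_0
        if alt_M != N[k] - R[k] + alt_R:
            return False
        # (4)  alternating sum of the M's is at least that of the R's
        if alt_M < alt_R:
            return False
    return True


if __name__ == "__main__":
    print(check_relations([1, 2, 1], [1, 2, 1], [1, 2, 1]))  # counts equal to connectivities
    print(check_relations([2, 1, 0], [2, 0, 0], [1, 0, 0]))  # one extra pair of extremals
```

In the second sample the extra cycle counted in $N_0$ forces, through (2), one
additional chain in $M_1$, which is how an excess minimum is paired with an
extremal of type one.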


3. Generalized Minimax Principle.—Now consider a functional $J(P)$
defined on the function space $\Omega$ having the properties described in §2.
We assume that on each $k$-chain $J(P)$ has a finite absolute maximum which
it attains at a set of points $P$, called the “critical $k$-set of $J$ on the
chain.”

For each $k$-chain $C_k$ we define a set of admissible deformations $\delta_k$
of $C_k$ into a $k$-chain, which never increase $J$ and which deform every $j$-chain
0 2a on admissibly, non-bounding cycles being deformed into non¬ 
bounding cyles. 

Let $D_k$ denote some class of such admissible deformations $\delta_k$ of
$k$-chains into $k$-chains which deforms independent non-bounding $k$-cycles
into independent non-bounding $k$-cycles. The class of all $k$-cycles whose
critical $k$-sets are invariant under $D_k$ will be denoted by
$\mathfrak{N}_k$.⁴ The set $\mathfrak{N}_k$ is closed under addition if we
adjoin the null $k$-cycle. Let $\mathfrak{M}_0 = \mathfrak{N}_0$ and
$\mathfrak{M}_k$ ($k > 0$) be the class of all $k$-chains whose boundaries are
in $\mathfrak{N}_{k-1}$ and whose critical $k$-sets are invariant under $D_k$.
Clearly $\mathfrak{M}_k$ and $\mathfrak{N}_k$ are related as in §2, except for
the fact that there may exist a $(k-1)$-cycle in $\mathfrak{N}_{k-1}$ which
bounds on $\Omega$ but bounds no chain in $\mathfrak{M}_k$.

We make the following assumptions concerning the selected class of
deformations $D_k$.

I. Every $k$-cycle can be deformed by a deformation in $D_k$ into a $k$-cycle
in $\mathfrak{N}_k$, possibly the null $k$-cycle.

II. Every $k$-chain on $\Omega$ whose boundary is in $\mathfrak{N}_{k-1}$ can
be deformed into a $k$-chain in $\mathfrak{M}_k$ having the same boundary, by a
deformation in $D_k$.

III. Every class of admissible deformations containing $D_k$ as a sub-class
and preserving independence of non-bounding cycles defines the same class
$\mathfrak{M}_k$.

Postulate II implies that the set $\mathfrak{M}_k$ is closed under addition.
We define the numbers $M_k$, $N_k$, $R_k$ as in §2. It is clear that these
numbers satisfy the relations (1), (2), (3), (4).

A point $P$ of $\Omega$ which belongs to the critical $k$-set of a $k$-chain in
$\mathfrak{M}_k$ will be called an extremal of $J$ of type $k$. The count of
all extremals of type $k$ is defined to be $M_k$. The count of all extremals
belonging to the critical $k$-sets of the $k$-cycles in $\mathfrak{N}_k$ will
be defined to be $N_k$. Extremals of type $k$ and their counts are well defined
by virtue of postulate III. It should be noted that extremals of type
$j \le k$ are invariant under $D_k$ while those of type $j > k$ are not unless
they are also of type $k$.

Generalized Minimax Principle.—Under the above definitions and
postulates the critical extremals of type $k$ of the functional $J$ exist and
their counts satisfy the inequalities (1), (2), (3), (4).

4. Applications to the Calculus of Variations.—It has long been recognized
that in the Calculus of Variations we are dealing with a functional $J$
defined over a function space $\Omega$. For example, in the simple case in
which we seek to find extremals of an integral $I$ in a class of arcs joining
two fixed points, each arc of the class is considered as a point of $\Omega$
and the integral $I$ as the functional $J$.

In view of the results of the last section the problem of finding extremals
of type $k$ has been reduced to that of suitably defining $k$-chains and
deformations $D_k$ on $\Omega$. The deformations used by Birkhoff and Morse in
the case of simple integrals involve the use of broken extremals, and form
a sub-class of the admissible deformations $D_k$.

The minimax principle as formulated above is one of extraordinary
generality, applying, for instance, to multiple integrals as well as simple
integrals. A very simple example in double integrals is the following one.
Consider the region between two concentric spheres $S_1$ and $S_2$. Let $M$ be
a three-dimensional analytic Riemannian manifold homeomorphic with
the region obtained by identifying points of $S_1$ and points of $S_2$ which
lie on the same radii. Consider the class $\Omega$ of surfaces on $M$ which are
homeomorphic with the image of $S_1$ and have a finite area less than a
constant $C$, sufficiently large. Such a surface is thought of as a point of
$\Omega$. The connectivities of $\Omega$ are clearly $R_0 = R_1 = 1$,
$R_j = 0$ ($j > 1$). Hence if the functional $J(P)$ is defined to be the area
of the point (surface) $P$ it is clear that there must exist at least one
minimal surface of least area and one minimal (minimax) surface of type one,
provided, of course, that the existence of a suitable class of deformations be
established.

A full treatment and applications of the “generalized minimax principle”
as well as a study of the “natural isoperimetric conditions” of the
immediately following Note will be given by us in a forthcoming article in
the Duke Mathematical Journal.




¹ Dynamical Systems with Two Degrees of Freedom, Trans. Amer. Math. Soc., 18
(1917).

² For references to Morse's work of 1925 and later see his book The Calculus
of Variations in the Large, Colloquium Pub. Amer. Math. Soc., 18 (1934).

³ The theorem of Birkhoff (Dynamical Systems, p. 135) that any surface
homeomorphic with a hypersphere has at least one closed geodesic depends for
its proof upon a simple case of this extension.

⁴ It is understood that every sub-$k$-cycle of a $k$-cycle in
$\mathfrak{N}_k$ is also in $\mathfrak{N}_k$, and that homologous $k$-cycles in
$\mathfrak{N}_k$ having the same critical $k$-sets are to be regarded as
identical. A similar remark holds for $k$-chains in $\mathfrak{M}_k$.





Reprinted from Duke Mathematical Journal
Vol. 1, No. 4, December, 1935


GENERALIZED MINIMAX PRINCIPLE IN THE CALCULUS 

OF VARIATIONS 

By G. D. Birkhoff and M. R. Hestenes
Introduction 

In the study of the critical points of a function $f(x_1, \dots, x_n)$ one
naturally begins with the maximum and minimum points. Similarly, the study of
the
critical extremals of an integral 

J = 

joining two fixed points begins with the properties of minimizing or maximizing 
extremals and the question of their existence. It was not until recently that a 
systematic study was made of critical points of functions and critical extremals 
of integrals which are not necessarily of the minimizing or maximizing type. 
This study seems to have had its beginning in 1917 in a paper by Birkhoff¹ in
which he enunciated his minimax principle. Birkhoff treats only the critical
points and critical extremals of the so-called type one. Beginning with a paper
in 1925 Morse² has developed systematically by the use of Analysis Situs the
existence and the relation between critical points and critical extremals of all
types. A. B. Brown and a number of others have also written on this subject.²

The principal method used heretofore in obtaining the critical point relations is the following. A value $b$ will be called a critical value of our functional $f(P)$ if there is a critical point $Q$ of $f(P)$ such that $f(Q) = b$. We now consider the connectivities of the domains $f(P) \le b$, as the constant $b$ varies from the absolute minimum $b_0$ of $f(P)$ on the domain under consideration. It is found that the connectivities of $f \le b$ change only when the variable $b$ passes through a critical value of $f(P)$. The change in connectivity depends upon the type of the critical point $P$ having this critical value. By studying these changes of connectivity one is able to classify the critical points of $f(P)$ and to obtain the critical point relations. This method was used by Birkhoff in order to obtain his minimax principle and by Morse to obtain a complete set of critical point

Received June 12, 1935. 

1 Dynamical systems with two degrees of freedom, Transactions of the American Mathematical Society, vol. 18 (1917), pp. 199-300.

2 For references to literature on this subject see Morse, Calculus of Variations in the Large, Colloquium Lectures, American Mathematical Society, vol. 18 (1934). Unless otherwise expressly stated, all references to Morse are to his book. See also Morse and Van Schaack, Abstract critical sets, Proceedings of the National Academy of Sciences, vol. 21 (1935), pp. 258-62.



relations. Lately Lefschetz has announced some further results, apparently using this method.³

In the above method, the notions of minimum and maximum are seen to be special instances of a more general notion of the type of a critical point. The question naturally arises: cannot this process be reversed? Cannot all these critical relations be obtained by a minimizing principle? We propose to show in this paper that this can be done.⁴ Our primary notion is essentially that of finding the minimum of the maximum $F(C_k)$ of $f(P)$ on a suitably chosen class of $k$-chains $C_k$. It is for this reason we term our method the minimax method.

It is not to be expected that we can treat the minimax principle completely in this paper. We have therefore chosen two of the simplest and most interesting topics. We study first the critical points of a non-degenerate function $f(x_1, \dots, x_n)$ and secondly the critical extremals of an integral $J$ for the fixed end point problems in the Calculus of Variations.

I 

The critical points of functions 

1. Hypotheses and definitions. The basic theory underlying the critical point relations can best be illustrated by studying those of a function $f(x_1, \dots, x_n)$ defined over a region $\mathfrak{S}$ in a euclidean space of points $(x_1, \dots, x_n)$. We suppose that $f(x)$ is continuous and has continuous first and second derivatives on $\mathfrak{S}$.

A point $P$ on $\mathfrak{S}$ will be called a critical point of $f(x)$ if the derivatives $f_{x_i}$ are all zero at the point $P$. If $P$ is a critical point, then $f(P)$ will be called a critical value of $f(x)$. A critical point $P$ will be called non-degenerate if the determinant

$|f_{x_i x_j}| \qquad (i, j = 1, \dots, n)$

is different from zero at $P$. The negative type number of the quadratic form

$\sum_{i,j} f_{x_i x_j} z_i z_j \qquad (i, j = 1, \dots, n)$

will be called the type or the index of $P$ as a critical point of $f(x)$. A function $f$ will be said to be non-degenerate if all of its critical points are non-degenerate. We shall assume that $f(x)$ is non-degenerate.
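
The definition of the type of a critical point can be illustrated by a short computation. The following is a minimal sketch, not part of the original paper: it assumes the matrix of second derivatives at the critical point is supplied as a list of rows, and it counts the negative roots of the characteristic polynomial by Descartes' rule of signs (exact here, since a symmetric matrix has only real roots). The names `char_poly` and `type_number` are hypothetical.

```python
from fractions import Fraction

def char_poly(a):
    """Coefficients [1, c_{n-1}, ..., c_0] of det(t*I - A), by Faddeev-LeVerrier."""
    n = len(a)
    a = [[Fraction(v) for v in row] for row in a]
    m = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    coeffs = [Fraction(1)]
    for k in range(1, n + 1):
        am = [[sum(a[i][t] * m[t][j] for t in range(n)) for j in range(n)]
              for i in range(n)]
        c = -sum(am[i][i] for i in range(n)) / k
        coeffs.append(c)
        m = [[am[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return coeffs

def type_number(hessian):
    """Number of negative eigenvalues of a symmetric, non-singular matrix."""
    n = len(hessian)
    coeffs = char_poly(hessian)
    if coeffs[-1] == 0:
        raise ValueError("degenerate critical point: the determinant is zero")
    # Coefficients of p(-t); its positive roots are the negative roots of p(t).
    flipped = [c * (-1) ** (n - i) for i, c in enumerate(coeffs)]
    signs = [1 if c > 0 else -1 for c in flipped if c != 0]
    # All roots are real, so the number of sign changes is the exact count.
    return sum(1 for s, t in zip(signs, signs[1:]) if s != t)

# Assumed example: f = x^3 - 3x + y^2 has critical points (1, 0) and (-1, 0),
# with second-derivative matrices diag(6, 2) and diag(-6, 2).
minimum_type = type_number([[6, 0], [0, 2]])
saddle_type = type_number([[-6, 0], [0, 2]])
```

In this assumed example the point (1, 0) is of type 0 (a minimum) and the point (-1, 0) is of type 1.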

We suppose that there is a closed region $\mathfrak{R}$ interior to $\mathfrak{S}$ containing all the critical points of $f$ in its interior. The boundary $B$ of $\mathfrak{R}$ is assumed to be a regular manifold of class $C''$, that is, one such that for each point $P$ on $B$ there is a neighborhood $N$ on $B$ representable in the form

$\varphi(x_1, \dots, x_n) = 0,$

3 Lefschetz, Application of chain deformations to critical points and extremals, Proceedings of the National Academy of Sciences, vol. 21 (1935), pp. 220-221.

4 See our paper in the Proceedings of the National Academy of Sciences, February 1935, pp. 90-99, with the same title, in which this principle was first formulated. Cf. Birkhoff, Dynamical Systems, p. 135; and Morse, p. 272.




where $\varphi(x)$ is continuous and has continuous derivatives of the first two orders and is such that the derivatives are not all zero at any point on $N$.

On the boundary $B$ of $\mathfrak{R}$ the normal derivative $f_N$ of $f$ along the outer normal is well defined by virtue of the fact that there are no critical points of $f(x)$ on $B$. We shall suppose that $f_N$ is positive on $B$. The case in which $f_N$ is not necessarily positive will be considered in §4 below.

It is clear that there are but a finite number of critical points of $f$ in $\mathfrak{R}$ since non-degenerate critical points are necessarily isolated. We may suppose without loss of generality that if $P_1$, $P_2$ are two distinct critical points of $f(x)$, then $f(P_1) \ne f(P_2)$. If this were not so, we could alter $f$ slightly in a neighborhood of $P_1$ so that $f(P_1) \ne f(P_2)$ without changing the type number of $P_1$ or introducing new critical points. Similarly we can modify $f$ so that in a sufficiently small neighborhood of a critical point $P$ of type $k$ the function $f$ takes the form

$f = c - x_1^2 - x_2^2 - \cdots - x_k^2 + x_{k+1}^2 + \cdots + x_n^2$

for a suitable choice of the coordinates $(x)$. The proofs of these facts have been given by Morse and Van Schaack.⁵

The cycles and chains here used are singular cycles and chains taken modulo 2. The number $R_k$ of linearly independent non-bounding $k$-cycles on $\mathfrak{R}$ in a maximal set is called the $k$-th connectivity of $\mathfrak{R}$.

2. Critical point relations. In this section we shall give an intuitive proof of the following theorem due to Morse.

Theorem 2.1. Let $M_k$ be the number of critical points of type $k$ and $R_k$ the $k$-th connectivity of the region $\mathfrak{R}$. The following relations hold:

$$
\begin{aligned}
M_0 &\ge R_0,\\
M_1 - M_0 &\ge R_1 - R_0,\\
M_2 - M_1 + M_0 &\ge R_2 - R_1 + R_0,\\
&\;\;\vdots\\
M_n - M_{n-1} + \cdots + (-1)^n M_0 &= R_n - R_{n-1} + \cdots + (-1)^n R_0.
\end{aligned}
\tag{2.1}
$$
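
As an illustration of how the relations (2.1) are read, the following minimal sketch (not from the paper) evaluates both sides of each relation for given lists of type numbers and connectivities. The function name `critical_point_relations` and the sample counts are assumptions chosen for the example: a plane disc has mod 2 connectivities 1, 0, 0, and a function on it with two minima and one saddle point has type numbers 2, 1, 0.

```python
def critical_point_relations(M, R):
    """Evaluate the relations (2.1) for type numbers M[0..n] and connectivities R[0..n].

    Returns one tuple (lhs, rhs, holds) per relation; the last relation is an
    equality, the others are inequalities lhs >= rhs.
    """
    if len(M) != len(R):
        raise ValueError("M and R must have the same length n + 1")
    n = len(M) - 1
    results = []
    for k in range(n + 1):
        lhs = sum((-1) ** (k - i) * M[i] for i in range(k + 1))
        rhs = sum((-1) ** (k - i) * R[i] for i in range(k + 1))
        holds = (lhs == rhs) if k == n else (lhs >= rhs)
        results.append((lhs, rhs, holds))
    return results

# Assumed example: two minima and one saddle point on a disc.
disc_check = critical_point_relations([2, 1, 0], [1, 0, 0])

# Two minima with no saddle point between them would violate the second relation.
impossible = critical_point_relations([2, 0, 0], [1, 0, 0])
```

The second call shows the relations acting as a constraint: two minima on a disc, under the boundary hypothesis of §1, force at least one critical point of type 1.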

The first inequality $M_0 \ge R_0$ in (2.1) is, of course, well known to all students of mathematics. It follows from the fact that $f(x)$ has at least one minimum point in each of its $R_0$ connected pieces. We can obtain this result in another way which is more complicated but which has the advantage that it can be generalized so as to obtain the remaining relations (2.1). Let $C_0$ be a 0-cycle on $\mathfrak{R}$ and $F(C_0)$ the maximum value of $f(x)$ on $C_0$. The inequality $M_0 \ge R_0$ can be obtained by finding the number of non-equivalent 0-cycles which afford a relative minimum to the functional $F(C_0)$. We generalize this method as follows. Let $F(C_k)$ be the maximum value of $f(x)$ on a $k$-cycle or a $k$-chain $C_k$.

5 The critical point theory under general boundary conditions, Annals of Mathematics, vol. 35 (1934), pp. 547-550.




If the class of admissible $k$-chains $C_k$ is suitably chosen, the number of non-equivalent $k$-chains which afford a minimum to $F(C_k)$ is equal to the number $M_k$ of critical points of type $k$, and the remaining inequalities (2.1) follow readily. This section will be devoted to developing these ideas.

We begin with the 0-cycles $C_0$ on $\mathfrak{R}$. We admit only such 0-cycles $C_0$ as are non-bounding on the domain $f \le F(C_0)$. Consider now an admissible 0-cycle $C_0$. If we deform $C_0$ continuously without increasing $F(C_0)$, we finally obtain a 0-cycle $C_0'$ which cannot be further deformed into a 0-cycle on $f < F(C_0')$ without increasing $F(C_0')$ and which is such that $f(P) = F(C_0')$ at only one point on $C_0'$. Such a cycle will be called a minimum 0-cycle. The point on $C_0'$ at which $f(P) = F(C_0')$ is clearly a minimum point of $f(x)$. Let us denote by $\mathfrak{N}_0$ the class of all minimum 0-cycles on $\mathfrak{R}$. Each 0-cycle in $\mathfrak{N}_0$ determines a unique critical point $P$ of $f(x)$ on $\mathfrak{R}$. But to every minimum point $P$ there will correspond in general infinitely many 0-cycles $C_0$ having $f(P) = F(C_0)$. These 0-cycles must be considered as equivalent if they are to be used as a count of the minimum points of $f(x)$. We accordingly define two 0-cycles $C_0$, $C_0'$ in $\mathfrak{N}_0$ to be equivalent if $F(C_0) = F(C_0')$ and their sum $C_0 + C_0'$ is homologous on $f \le F(C_0)$ to zero or to the 0-cycles on $f < F(C_0)$. It is clear that for the case here considered two 0-cycles $C_0$, $C_0'$ in $\mathfrak{N}_0$ are equivalent if and only if $F(C_0) = F(C_0')$. Hence the number $N_0$ of non-equivalent 0-cycles in $\mathfrak{N}_0$ is equal precisely to the number $M_0$ of critical points of minimum type. Moreover,

(2.2) $M_0 = N_0 \ge R_0,$

since every non-bounding 0-cycle on $\mathfrak{R}$ can be admissibly deformed into a 0-cycle in $\mathfrak{N}_0$. This proves the first inequality (2.1). For reasons which will appear later it is convenient to refer to the class $\mathfrak{N}_0$ of 0-cycles as the class $\mathfrak{M}_0$ of 0-chains and to denote by $M_0$ the number of non-equivalent 0-chains in $\mathfrak{M}_0$.

In order to extend the method described above so as to obtain further critical point relations, let us first examine the nature of a critical point $P$ of type 1. We shall suppose that the coordinate system $(x)$ has been chosen so that $(x) = (0)$ at $P$ and that the function $f(x)$ takes the form

$f = c - x_1^2 + x_2^2 + \cdots + x_n^2$

in a neighborhood $V$ of $P$. Consider the 1-chain $C_1$ defined by the relations

(2.3) $x_1^2 \le h^2, \qquad x_2^2 + \cdots + x_n^2 = 0.$

For $h$ sufficiently small the 1-chain $C_1$ will lie in $V$, and moreover the maximum value $F(C_1)$ of $f(x)$ on $C_1$ is attained only at the critical point $P$. Now it may be possible to join the end points of the arc $C_1$ by a second arc $C_1'$ on the domain $f < f(P)$. The closed arc $C_1'' = C_1 + C_1'$ then forms a 1-cycle on $\mathfrak{R}$ which cannot be deformed continuously into a 1-cycle on $f < F(C_1'')$ without increasing $F(C_1'')$. Moreover, $f(Q) = F(C_1'')$ at a point $Q$ on $C_1''$ only in case the point $Q$ is coincident with the critical point $P$. Thus we see that the cycle $C_1''$ can be




considered to be a minimizing 1-cycle for the functional $F(C_1)$. The cycle $C_1''$ is clearly a linking cycle in the Morse sense.⁶

If the boundary of the 1-chain $C_1$ defined by (2.3) is not homologous to zero on the domain $f < F(C_1)$, clearly it is homologous on this domain to a 0-cycle $C_0$ in the class $\mathfrak{N}_0$ of minimizing 0-cycles for $F(C_0)$. Let $C_1'$ be the 1-chain joining the ends of $C_1$ with the 0-cycle $C_0$. The 1-chain $C_1'' = C_1 + C_1'$ has properties analogous to those of the 1-cycle $C_1''$ described in the last paragraph. For example, the 1-chain $C_1''$ cannot be deformed continuously into a 1-chain on $f < F(C_1'')$ without increasing $F(C_1'')$ or $F(C_0)$. Moreover, the only point $Q$ on $C_1''$ at which $f(Q) = F(C_1'')$ is the critical point $P$ under consideration. It should be noted that for the 1-chain $C_1''$ here constructed the boundary $C_0$ is not homologous to zero on the domain $f < F(C_1'')$. Here again $C_1''$ appears as a minimizing 1-chain for the functional $F(C_1)$.

In view of the above remarks it would seem that one should be able to obtain all the critical points of $f(x)$ of type 1 by minimizing $F(C_1)$ on a suitably chosen class of 1-chains. This is indeed the case. We admit, for obvious reasons, only a special class of 1-chains, which we shall term admissible 1-chains. A 1-cycle $C_1$ will be admitted if it is non-bounding on the domain $f \le F(C_1)$. A 1-chain possessing a boundary will be admitted if its boundary is in $\mathfrak{N}_0$. In order to find the minimizing cycles for $F(C_1)$, we associate with each admissible 1-chain $C_1$ a set of deformations, called admissible deformations, which never increase $F(C_1)$; nor do they increase $F(C_0)$ if the boundary $C_0$ of $C_1$ exists. These deformations are to be continuous, except for the fact that they may subdivide $C_1$ into a finite number of parts subject to the following restrictions. The image of a 1-cycle $C_1$ under a deformation must be homologous to $C_1$ on the domain $f \le F(C_1)$. The image $C_1'$ of a 1-chain $C_1$ having a boundary $C_0$ must be related to $C_1$ as follows. Let $C_0'$ be the boundary of $C_1'$ and let $C_1''$ be a 1-chain on $f \le F(C_0)$ bounded by the 0-cycle $C_0 + C_0'$. The 1-cycle $C_1 + C_1' + C_1''$ must be homologous to zero on the domain $f \le F(C_1)$ if our deformation is to be admissible.

Consider now an admissible 1-chain $C_1$. By means of an admissible deformation we can deform $C_1$ into a 1-chain $C_1'$ which cannot be further deformed into a 1-chain on the domain $f < F(C_1')$ and which has $f(Q) = F(C_1')$ only at the points $Q$ on $C_1'$ which are on a single point $P$ on $\mathfrak{R}$. The point $P$ can be shown to be a critical point of type 1. The class of all 1-chains having the same properties as $C_1'$ will be denoted by $\mathfrak{M}_1$. Each 1-chain $C_1$ in $\mathfrak{M}_1$ has associated with it a unique critical point $P$ of type 1 having $f(P) = F(C_1)$. However, to each critical point $P$ of type 1 there correspond in general infinitely many 1-chains $C_1$ in $\mathfrak{M}_1$ having $F(C_1) = f(P)$. These 1-chains must be regarded as equivalent if they are to be used as a count of the critical points. Hence we shall agree to call two 1-chains $C_1$, $C_1'$ in $\mathfrak{M}_1$ equivalent if $F(C_1) = F(C_1')$. This definition of equivalence is not sufficiently general to be applicable in the case for which there may be more than one critical point corresponding to each critical value.

6 Cf. Morse, p. 158.




We shall accordingly say that two 1-cycles $C_1$, $C_1'$ in $\mathfrak{M}_1$ are equivalent if $F(C_1) = F(C_1')$ and their sum $C_1 + C_1'$ is homologous on $f \le F(C_1)$ to zero or to the 1-cycles on the domain $f < F(C_1)$. Two 1-chains $C_1$, $C_1'$ in $\mathfrak{M}_1$ not both 1-cycles will be said to be equivalent if $F(C_1) = F(C_1')$, if the boundary of their sum bounds a 1-chain $C_1''$ on $f < F(C_1)$, and finally if the 1-cycle $C_1 + C_1' + C_1''$ is homologous on $f \le F(C_1)$ to zero or to the 1-cycles on $f < F(C_1)$. It will be seen in the next section that our two definitions of equivalence are the same for the case here discussed. It follows that the number of non-equivalent 1-chains in $\mathfrak{M}_1$ in a maximal set is equal precisely to the number $M_1$ of critical points of type 1.

The number of non-equivalent 1-chains in $\mathfrak{M}_1$ can be evaluated in a second way. To do so let us denote by $\mathfrak{N}_1$ the class of all 1-cycles in $\mathfrak{M}_1$ and the maximum number of non-equivalent 1-cycles in $\mathfrak{N}_1$ by $N_1$. It is clear that every non-bounding 1-cycle in $\mathfrak{R}$ can be deformed by an admissible deformation into a 1-cycle in $\mathfrak{N}_1$. Hence we have $N_1 \ge R_1$, where $R_1$ is the linear connectivity of $\mathfrak{R}$. Moreover, there are exactly $N_0 - R_0$ non-equivalent bounding 0-cycles in $\mathfrak{N}_0$ in a maximal set. Let $C_0$ be one of these and let $b$ be the greatest lower bound of the values of $F(C_1)$ on the 1-chains $C_1$ bounded by $C_0$. Clearly any 1-chain $C_1$ with $F(C_1)$ near $b$ can be deformed admissibly into one for which $F(C_1) = b$. The resulting chain is in $\mathfrak{M}_1$ and is bounded by $C_0$. There are accordingly $N_0 - R_0$ non-equivalent 1-chains of this type. Moreover, no linear combination of these 1-chains is equivalent to a 1-cycle in $\mathfrak{N}_1$. It follows readily that there are exactly $N_1 + N_0 - R_0$ non-equivalent chains in $\mathfrak{M}_1$. But the number of non-equivalent chains in $\mathfrak{M}_1$ was seen above to be equal to the number $M_1$ of critical points of type 1. Hence we have

(2.4) $M_1 \ge N_1 \ge R_1, \qquad M_1 - N_1 = N_0 - R_0.$

The second relation (2.1) follows readily from the second relation (2.4) by replacing $N_0$ by $M_0$ and $N_1$ by $R_1$.
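
The greatest lower bound $b$ of $F(C_1)$ over the 1-chains joining two minimum points has a simple discrete analogue, sketched below under stated assumptions. This is not the authors' construction: the function is sampled on a grid, a 1-chain is replaced by a path of neighbouring grid points, and $b$ is found as the lowest level at which the two points fall in the same connected piece of the sampled domain $f \le b$. The names `minimax_level`, `values` and `neighbors`, and the sample function $(x^2 - 1)^2 + y^2$, are all hypothetical.

```python
def minimax_level(values, neighbors, a, b):
    """Smallest level at which vertices a and b are joined in the sublevel set.

    values: dict mapping each vertex to its function value.
    neighbors: function returning the vertices adjacent to a given vertex.
    Vertices are added in order of increasing value and merged with the
    neighbours already present (union-find).
    """
    parent = {}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]  # path halving
            v = parent[v]
        return v

    for v in sorted(values, key=values.get):
        parent[v] = v
        for w in neighbors(v):
            if w in parent:
                parent[find(w)] = find(v)
        if a in parent and b in parent and find(a) == find(b):
            return values[v]
    return None  # a and b are never joined

# Assumed example: f = (x^2 - 1)^2 + y^2 sampled at x = i/4, y = j/4.
# Its minima are at (x, y) = (-1, 0) and (1, 0); its saddle point is at (0, 0).
values = {(i, j): ((i / 4) ** 2 - 1) ** 2 + (j / 4) ** 2
          for i in range(-8, 9) for j in range(-4, 5)}

def neighbors(v):
    i, j = v
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))
    return [(i + di, j + dj) for di, dj in steps if (i + di, j + dj) in values]

level = minimax_level(values, neighbors, (-4, 0), (4, 0))
```

In this assumed example the level returned is the value of the function at the saddle point, which is the critical point of type 1 that the minimizing 1-chain must pass through.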

The above arguments can be extended inductively to any dimension. We suppose that the classes $\mathfrak{M}_k$, $\mathfrak{N}_k$ of minimizing $k$-chains and $k$-cycles for $F(C_k)$ have been constructed for $k = 0, 1, \dots, j - 1$. For these values of $k$ the numbers $M_k$, $N_k$ of non-equivalent $k$-chains in $\mathfrak{M}_k$, $\mathfrak{N}_k$ respectively satisfy the relations

(2.5) $M_k \ge N_k \ge R_k, \qquad M_k - N_k = N_{k-1} - R_{k-1} \qquad (k = 1, \dots, j - 1).$

The proof that these relations hold for $k = j$ can be established by precisely the same arguments as those made for $k = 1$. We merely need to make the obvious changes such as replacing 1 by $j$ and 0 by $j - 1$ wherever they occur. For this reason we shall give here only the important definitions and ideas used in the development.

In order to establish the relations (2.5) for $k = j$ we minimize $F(C_j)$ in the class of admissible $j$-chains. A $j$-cycle $C_j$ will be termed admissible if it is non-bounding on the domain $f \le F(C_j)$. A $j$-chain $C_j$ having a boundary $C_{j-1}$ will




be termed admissible if $C_{j-1}$ is in $\mathfrak{N}_{j-1}$. An admissible deformation is one which deforms an admissible $j$-chain $C_j$ without increasing $F(C_j)$, and which deforms the boundary of $C_j$ admissibly. An admissible deformation is to be continuous except that we admit the possibility of $C_j$ breaking up into a finite number of pieces subject to the following restrictions. If $C_j$ is a cycle, its image under an admissible deformation must be homologous to $C_j$ on the domain $f \le F(C_j)$. If $C_j$ is a $j$-chain bounded by a $(j-1)$-cycle $C_{j-1}$ in $\mathfrak{N}_{j-1}$, the image $C_j'$ of $C_j$ and the boundary $C_{j-1}'$ of $C_j'$ must be related to $C_j$, $C_{j-1}$ as follows. Let $C_j''$ be a $j$-chain on $f \le F(C_{j-1})$ bounded by the $(j-1)$-cycle $C_{j-1} + C_{j-1}'$. The $j$-cycle $C_j + C_j' + C_j''$ must be homologous to zero on the domain $f \le F(C_j)$. We now define a $j$-chain $C_j$ to be in the class $\mathfrak{M}_j$ if $C_j$ cannot be admissibly deformed into a $j$-chain on $f < F(C_j)$, and if $f(Q) = F(C_j)$ only at the points $Q$ which are on a single point $P$ on $\mathfrak{R}$. The point $P$ can be seen to be a critical point of type $j$. Two $j$-cycles $C_j$, $C_j'$ in $\mathfrak{M}_j$ will be called equivalent if $F(C_j) = F(C_j')$ and the $j$-cycle $C_j + C_j'$ is homologous on $f \le F(C_j)$ to zero or to the $j$-cycles on $f < F(C_j)$. Two $j$-chains $C_j$, $C_j'$ which are not both $j$-cycles will be called equivalent if $F(C_j) = F(C_j')$, if the boundary of $C_j + C_j'$ bounds a $j$-chain $C_j''$ on $f < F(C_j)$, and finally if the $j$-cycle $C_j + C_j' + C_j''$ is homologous on $f \le F(C_j)$ to zero or to the $j$-cycles on $f < F(C_j)$. For the case here considered two $j$-chains $C_j$, $C_j'$ in $\mathfrak{M}_j$ can be seen to be equivalent if and only if $F(C_j) = F(C_j')$. The number of non-equivalent $j$-chains in $\mathfrak{M}_j$ in a maximal set is therefore equal precisely to the number $M_j$ of critical points of $f(x)$ of type $j$. Let $\mathfrak{N}_j$ be the class of all $j$-cycles in $\mathfrak{M}_j$, and $N_j$ be the maximum number of non-equivalent $j$-cycles in $\mathfrak{N}_j$. Since each non-bounding $j$-cycle on $\mathfrak{R}$ can be deformed into one in $\mathfrak{N}_j$, we have $N_j \ge R_j$. Moreover, one can see as in the case $j = 1$ that there are exactly $N_{j-1} - R_{j-1}$ non-equivalent $j$-chains in $\mathfrak{M}_j$, no linear combination of which is equivalent to a $j$-cycle in $\mathfrak{N}_j$. The number $M_j$ of non-equivalent $j$-chains in $\mathfrak{M}_j$ is accordingly equal to $N_j + N_{j-1} - R_{j-1}$. The relations (2.5) accordingly hold for $k = j$. From the relations (2.5) and (2.2) it follows readily that

$M_k - M_{k-1} + \cdots + (-1)^k M_0 = N_k - R_{k-1} + R_{k-2} - \cdots + (-1)^k R_0,$

$M_k - M_{k-1} + \cdots + (-1)^k M_0 \ge R_k - R_{k-1} + \cdots + (-1)^k R_0,$

the equality holding if and only if $N_k = R_k$. From the relations (2.5) it is clear that $N_k = R_k$ if either $M_{k+1} = R_{k+1}$ or $M_k = R_k$. In particular $N_n = R_n$, as one readily verifies.

Thus we see that all critical point relations can be obtained by a minimax principle. We minimize the maximum $F(C_k)$ of $f(x)$ on the $k$-chains of a suitably chosen class. In order to complete the arguments made above we must establish the following statements.

I. Every $k$-cycle $C_k$ on $\mathfrak{R}$ which is non-bounding on $f \le F(C_k)$ can be deformed admissibly into a $k$-cycle in $\mathfrak{N}_k$.

II. Every $k$-chain $C_k$ on $\mathfrak{R}$ whose boundary is in $\mathfrak{N}_{k-1}$ can be admissibly deformed into one having an equivalent boundary.




III. The points on a $k$-chain $C_k$ in $\mathfrak{M}_k$ at which $f(P) = F(C_k)$ correspond to a single critical point $P$ of $f(x)$ of type $k$. Two $k$-chains $C_k$, $C_k'$ in $\mathfrak{M}_k$ having $F(C_k) = F(C_k')$ are equivalent. Each critical point $P$ of type $k$ has associated with it at least one $k$-chain $C_k$ in $\mathfrak{M}_k$ such that $P$ is on $C_k$ and $f(P) = F(C_k)$.

These statements will be established in the next section. 

3. Proofs of the above three statements. The proofs of the three statements made at the end of the last section depend on two lemmas, the first of which is the following.

Lemma 3.1. Every admissible $k$-chain $C_k$ on $\mathfrak{R}$ can be admissibly deformed into a $k$-chain $C_k'$ such that the points on $C_k'$ at which $f = F(C_k')$ are in an arbitrarily small neighborhood $V$ of a critical point $P$ and such that the boundaries of $C_k$ and $C_k'$, if they exist, are equivalent. The points of $C_k'$ not in $V$ are on the domain $f < f(P)$.

To prove this lemma we use the orthogonal trajectories to the hypersurfaces $f$ = constant. These trajectories are the solutions of the differential equations

$dx_i/dt = f_{x_i} \qquad (i = 1, \dots, n),$

and are well defined except at the critical points of $f$. Through any ordinary point of $f$ there passes one and but one of these trajectories.

We now define a deformation $\Delta_b$. To do so, we consider a $k$-chain $C_k$. We suppose that there are no critical points $P$ on $C_k$ at which $f(P) = F(C_k)$. Let $b$ be a value of $f$ such that each point $P$ of $C_k$ on the domain $f \ge b$ can be joined to a unique point $P'$ on $f = b$ by means of an orthogonal trajectory. As the time $t$ varies from 0 to 1, a point $P$ on $f > b$ moves towards $P'$ along the trajectory $PP'$ at a rate equal to its initial length. The points on $f \le b$ are held fast. We term this deformation $\Delta_b$. It carries $C_k$ continuously into a $k$-chain $C_k'$ on the domain $f \le b$. The boundary $C_{k-1}$ of $C_k$, if it exists, is unaltered since $b$ clearly exceeds $F(C_{k-1})$.
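
The effect of a deformation of this kind can be imitated numerically. The sketch below is only an illustration under stated assumptions, not the construction of the paper: a chain is represented by finitely many sample points, each point above the level $b$ is carried down its orthogonal trajectory by small first-order steps until $f$ is within a tolerance of $b$, and only the end state (time $t = 1$) is computed. The name `push_down_to_level`, the step size, the tolerance, and the sample function $x^2 + y^2$ are all hypothetical.

```python
def push_down_to_level(points, f, grad, b, step=0.01, tol=1e-9, max_steps=100000):
    """Carry every sample point with f > b down its gradient trajectory to about f = b.

    Points already on f <= b are held fast.
    """
    moved = []
    for p in points:
        q = list(p)
        for _ in range(max_steps):
            excess = f(q) - b
            if excess <= tol:
                break
            g = grad(q)
            norm2 = sum(gi * gi for gi in g)
            if norm2 == 0.0:
                raise ValueError("met a critical point above the level b")
            # First-order step: lowers f by roughly min(step, excess).
            t = min(step, excess) / norm2
            q = [qi - t * gi for qi, gi in zip(q, g)]
        moved.append(tuple(q))
    return moved

# Assumed example: f = x^2 + y^2, whose only critical point lies at level 0.
f = lambda p: p[0] ** 2 + p[1] ** 2
grad = lambda p: (2 * p[0], 2 * p[1])

# Sample points of the segment from (-2, 1) to (2, 1); the maximum of f on it is 5.
chain = [(-2 + 0.25 * i, 1.0) for i in range(17)]
b = 2.0
image = push_down_to_level(chain, f, grad, b)
```

In this assumed example the maximum of $f$ on the sampled chain drops from 5 to (approximately) the level $b = 2$, while the sample points that were already on $f \le 2$ do not move.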

A second deformation $\Delta^b$ can now be defined as follows. Let $b$ be a critical value of $f(x)$ corresponding to a critical point $P$. The orthogonal trajectories through points on $f = b$ at distances $\rho$, $2\rho$ from $P$ together with the surfaces $f = b \pm \epsilon$ form tubular neighborhoods $T_1$, $T_2$ of $P$ which lie within any prescribed neighborhood $V$ of $P$, provided the positive constants $\rho$, $\epsilon$ are taken sufficiently small. We deform points outside $T_2$ according to the deformation $\Delta_{b-\epsilon}$. Points inside $T_1$ are held fast. A point $Q$ in $T_2$ but not in $T_1$ is deformed as follows. Let $P'$, $Q'$ be the points of $f = b$, $f = b - \epsilon$ respectively, on the orthogonal trajectory of $f = b$ through $Q$. The point $Q$ moves towards $Q'$ on this trajectory at a rate equal to $(d - \rho)/\rho$ times the length of the trajectory $QQ'$, where $d$ is the distance from $P'$ to $P$. The deformation so defined is admissible and will be denoted by $\Delta^b$.

The deformation described in the lemma can be made by applying successively deformations of the type $\Delta_b$ and $\Delta^b$. The lemma is established.

Our second lemma deals with deformations in a neighborhood of a critical point $P$ of type $k$. We may suppose that the coordinate system $(x)$ has been



chosen so that $P$ is the point $(x) = (0)$ and that in a neighborhood $V$ of $P$ the function $f(x)$ takes the form

(3.1) $f = b - x_1^2 - x_2^2 - \cdots - x_k^2 + x_{k+1}^2 + \cdots + x_n^2.$

Let $S_\rho$ be a spherical neighborhood

$x_1^2 + \cdots + x_n^2 \le \rho^2$

of $P$ within $V$. We are especially interested in the $k$-chain

(3.2) $x_1^2 + \cdots + x_k^2 \le \rho^2, \qquad x_{k+1}^2 + \cdots + x_n^2 = 0,$

which we shall denote by $\bar{C}_k$. Its boundary is denoted by $\bar{C}_{k-1}$. A $j$-chain $C_j$ in $S_\rho$ with its boundary $C_{j-1}$ on $f < b$ will be said to be equivalent to $\bar{C}_k$ if $j = k$ and if $C_{j-1}$ is homologous to $\bar{C}_{k-1}$ in $S_\rho$ on $f < b$.

Lemma 3.2. Let $C_j$ be a $j$-chain interior to $S_\rho$ whose boundary $C_{j-1}$ is on the domain $f < b$. If $C_j$ is not equivalent to $\bar{C}_k$, then $C_j$ can be admissibly deformed on $S_\rho$ into a $j$-chain $C_j''$ on $f < b$ having the same boundary. If $C_j$ is equivalent to $\bar{C}_k$, then $C_j$ cannot be so deformed, but can be deformed into a $k$-chain $C_k''$ on $f \le b$ having the same boundary and having $f = b$ only at the points on $C_k''$ which are on the critical point $P$.

To prove this, choose $\epsilon$ so small that $b - 9\epsilon^2 > F(C_{j-1})$, and let $S_\epsilon$, $S_{2\epsilon}$, $S_{3\epsilon}$ be three spherical neighborhoods of $P$ of radius $\epsilon$, $2\epsilon$, $3\epsilon$ respectively. By means of the deformation $\Delta^b$ we may deform $C_j$ admissibly so that all points of $C_j$ which are on the domain $f \ge b - \epsilon^2$ lie in $S_\epsilon$. We now make the following deformation.⁷ The points on $C_j$ outside and on the boundary of the sphere $S_{3\epsilon}$ are held fast. Inside the sphere $S_{3\epsilon}$ the coordinates $x_1, \dots, x_k$ of a point $(x)$ are held fast. Inside $S_{2\epsilon}$ the coordinates $x_j$ $(j > k)$ move towards zero as the time $t$ varies from zero to 1 at a rate equal to the absolute value of the initial value of $x_j$. In the region between $S_{2\epsilon}$ and $S_{3\epsilon}$ the coordinates $x_j$ $(j > k)$ move towards zero at a rate equal to $d/\epsilon$ times the absolute value of the initial value of $x_j$, where $d$ denotes the distance from the point $(x)$ to the boundary of $S_{3\epsilon}$.

Under the deformation just described the $j$-chain $C_j$ is admissibly deformed into a $j$-chain $C_j''$ on $f \le b$ and having the same boundary. The portion of $C_j''$ which is in $S_\epsilon$ lies on the $k$-chain

(3.3) $0 \le x_1^2 + \cdots + x_k^2 \le \epsilon^2, \qquad x_{k+1}^2 + \cdots + x_n^2 = 0$

by virtue of our deformation. Suppose now that $C_j$, and hence also $C_j''$, is not equivalent to $\bar{C}_k$. If $F(C_j'') < b$, our lemma is established. Hence we need to consider only the case $F(C_j'') = b$. In this case we may suppose that $C_j''$ is divided finely enough so that there is a $(j-1)$-cycle $C_{j-1}'$ which is composed of cells of $C_j''$ on the domain

(3.4) $0 < x_1^2 + \cdots + x_k^2 \le \epsilon^2, \qquad x_{k+1}^2 + \cdots + x_n^2 = 0,$

7 Cf. Morse, p. 170.




and which together with the boundary of $C_j''$ bounds a $j$-chain $\gamma_j$ on $f < b$ composed of cells of $C_j''$. The $(j-1)$-cycle $C_{j-1}'$ accordingly bounds the $j$-chain $C_j' = C_j'' + \gamma_j$ on (3.3). Now the connectivities of the domain (3.4) are those of a $(k-1)$-sphere, as will be seen below. It follows that $C_{j-1}'$ bounds a $j$-chain $\gamma_j'$ on the domain (3.4), since by construction it is not homologous to the boundary of (3.3). We now think of $C_j''$ as being equal to the sum of the chains $\gamma_j'' = \gamma_j + \gamma_j'$ and $C_j' + \gamma_j'$. The second of these chains is a $j$-cycle on the domain (3.3) and can be deformed into the null $j$-cycle in many different ways without increasing $F(C_j'')$. Thus we see that in this case $C_j''$ can be deformed into a $j$-chain $\gamma_j''$ on the domain $f < b$. The lemma is therefore true as stated for the case in which $C_j$ is not equivalent to $\bar{C}_k$. If $C_j$ is equivalent to $\bar{C}_k$, then $C_j''$ can be constructed as before. In this case we must have $F(C_j'') = b$, since otherwise the boundary of $\bar{C}_k$ would be homologous to zero in $S_\rho$ on the domain $f < b$, which is not the case. For the same reason $C_j''$ cannot be deformed into a $j$-chain on $f < b$ with the same boundary. The last statement in the lemma is accordingly true.

The proof of the lemma will be complete if we show that the connectivities of the domain (3.4) are those of a $(k-1)$-sphere. This is immediate, since we can deform the domain (3.4) into the $(k-1)$-sphere

(3.5) $x_1^2 + \cdots + x_k^2 = \epsilon^2, \qquad x_{k+1}^2 + \cdots + x_n^2 = 0$

as follows. As the time $t$ varies from 0 to 1 let each point $P$ move along the trajectories through $P$ toward the point $Q$ on (3.5) on the same radius at a rate equal to the initial length of the radius $PQ$. This completes the proof of Lemma 3.2.

Statements I, II, and the first part of III made at the end of the last section follow at once from Lemmas 3.1 and 3.2. The last statement in III follows immediately from the fact that the boundary $\bar{C}_{k-1}$ of the $k$-chain $\bar{C}_k$ given by the relation (3.2) is homologous on the domain $f < f(P)$ to zero or to a cycle $C_{k-1}$ in $\mathfrak{N}_{k-1}$. Let $C_k'$ be a $k$-chain on $f < f(P)$ bounded by $\bar{C}_{k-1}$ or by $\bar{C}_{k-1} + C_{k-1}$ as the case may be. The chain $C_k = \bar{C}_k + C_k'$ is clearly a $k$-chain in $\mathfrak{M}_k$ having $f(P) = F(C_k)$, and the last statement in III is established.

It remains to prove that two $k$-chains $C_k'$, $C_k''$ in $\mathfrak{M}_k$ having $F(C_k') = F(C_k'')$ are equivalent. To do so let $P$ be the critical point of type $k$ on $C_k'$ and $C_k''$ such that $f(P) = F(C_k') = F(C_k'')$, and suppose that $f$ is in the form (3.1) in a neighborhood $V$ of $P$. We construct the region $S_\rho$ and the $k$-chain (3.2) as before, and denote the chain by $\bar{C}_k$ and its boundary by $\bar{C}_{k-1}$. If the chain $C_k'$ is finely enough divided, there is a $k$-chain $\gamma_k'$ composed of cells of $C_k'$ in $S_\rho$ which is equivalent to $\bar{C}_k$ in the sense described in the paragraph preceding Lemma 3.2. Otherwise the $k$-chain $C_k'$ could be admissibly deformed on the domain $f < f(P)$ by Lemma 3.2. Similarly there is a $k$-chain $\gamma_k''$ composed of cells of $C_k''$ which is equivalent to $\bar{C}_k$. The boundaries $\gamma_{k-1}'$, $\gamma_{k-1}''$ of $\gamma_k'$, $\gamma_k''$ are homologous on $f < f(P)$ in $S_\rho$ to the boundary $\bar{C}_{k-1}$ of $\bar{C}_k$, and hence homologous to each other on the same domain. Let $\gamma_k$ be the $k$-chain bounded by $\gamma_{k-1}' + \gamma_{k-1}''$ in $S_\rho$ on $f < f(P)$. The $k$-chain $C_k' + C_k''$ can be written in the form

$C_k' + C_k'' = (\gamma_k' + \gamma_k'' + \gamma_k) + (C_k' + C_k'' + \gamma_k' + \gamma_k'' + \gamma_k).$

The $k$-chain in the first parentheses is a $k$-cycle which is homologous to zero on the domain $S_\rho$. The $k$-chain in the second parentheses is a $k$-chain $C_k$ on the domain $f < f(P)$, and hence is homologous to the $k$-cycles on this domain if $C_k'$, $C_k''$ are both $k$-cycles. If $C_k'$, $C_k''$ are not both $k$-cycles, then $C_k$ is a $k$-chain on $f < F(C_k')$ bounded by the boundary of the chain $C_k' + C_k''$. Moreover, the $k$-cycle $C_k' + C_k'' + C_k$ is clearly homologous to zero on $f \le F(C_k')$. Thus we see that in either case $C_k'$ and $C_k''$ are equivalent. This completes the proof of Theorem 2.1.

4. General boundary conditions. In the above treatment we assumed that the normal derivative $f_N$ along the outer normal is positive on the boundary $B$ of $\mathfrak{R}$. This assumption was made only for convenience. If this assumption does not hold, some of the admissible $k$-chains $C_k$ will, in general, be deformed by an admissible deformation into the region $A$ of $B$ where $f_N$ is negative. If this $k$-chain $C_k$ is deformed so that $F(C_k)$ is a minimum and so that $f(P) = F(C_k)$ at essentially one point on $C_k$, this point will not be a critical point of $f(x)$ in the usual sense, but will be a critical point of the function $g(y_1, \dots, y_{n-1})$ defined by $f(x)$ on the region $A$. It is clear, therefore, that if the relations (2.1) are to hold in this case we must also include the critical points of $g(y)$ on $A$ as well as those of $f(x)$ on $\mathfrak{R}$. Hence in this case we define $M_k$ to be the number of critical points of type $k$ of $f(x)$ on $\mathfrak{R}$ plus the number of type $k$ of $g(y)$ on $A$. The remainder of the proof is as before. We assume, of course, that the critical points of $g(y)$ are non-degenerate and that to each critical value of $f$ and $g$ there corresponds but one critical point of $f$ or $g$. The results here given can also be extended at once to non-singular functions defined on a closed Riemannian manifold. These cases have been discussed by Morse and Van Schaack⁸ using a different method.

5. The degenerate case. The methods used above can readily be extended to the case in which critical points of $f(x)$ are degenerate. We shall only briefly indicate how this can be done. The only difficulty which arises is that of deforming a $k$-chain down onto a critical point or a set of critical points. If this can be done, the arguments can be made as above, and we define the count of the critical points of type $k$ to be equal to the number $M_k$ of non-equivalent $k$-chains in $\mathfrak{M}_k$. The number $M_k$ so obtained will necessarily satisfy the conditions (2.1).

If an admissible $k$-chain cannot be deformed down onto a set of critical points, we can modify our procedure somewhat and define the $k$-cycles and $k$-chains

8 Loc. cit., footnote 2.




in $\mathfrak{M}_{k\epsilon}$ to be those whose maximum points are in an $\epsilon$-neighborhood of a set
of critical points at which $f = b$ but which cannot be admissibly deformed into a
$k$-chain on $f < b$. Under a suitable definition of equivalence of cycles and chains
in $\mathfrak{M}_{k\epsilon}$ it can be shown by methods analogous to those used above that the
number $M_k$ of non-equivalent $k$-cycles and $k$-chains in $\mathfrak{M}_{k\epsilon}$ is independent of the
particular choice of $\epsilon$, for $\epsilon$ sufficiently small, and satisfies the relations (2.1). In
the non-degenerate case the number $M_k$ so defined is equal to the number of
critical points of type $k$.

The treatment here given is in the large. But the same methods can be 
applied to neighborhoods of sets of critical points. We obtain thereby a
characterization of sets of critical points in the small, which is independent of the
particular neighborhood used, at least in the most important cases. 

II 

Generalized minimax principle 

The methods used in the last part can be given an abstract formulation which 
brings out the essential features of the method. The results of this section 
were published recently by the authors in a somewhat different form.*

6. Generalized minimax principle. Consider now a space $\Omega$ on which
the ordinary concepts of topology, such as $k$-cycles, $k$-chains, non-bounding
$k$-cycles, addition, homologies, etc., are well defined (modulo 2). The number
$R_k$ of independent $k$-cycles in a maximal set of such cycles is called the $k$-th
connectivity of $\Omega$. We admit the possibility of $R_k$ being infinite.

Suppose now that we have given a functional $f(P)$ which is well defined for
all points $P$ on $\Omega$. We shall denote by $F(C_k)$ the least upper bound of the values
$f(P)$ on the chain $C_k$.

Our critical point relations will be obtained by minimizing the functional
$F(C_k)$ on a suitably chosen set of $k$-chains. A $k$-cycle $C_k$ will be admitted only
in case $C_k$ is not homologous to zero on the domain $f \le F(C_k)$. We associate
with each admissible $k$-cycle a set of deformations, called admissible deformations,
which never increase $F(C_k)$ and which deform $C_k$ into a $k$-cycle which is
homologous to $C_k$ on the domain $f \le F(C_k)$. A minimum $k$-cycle $C_k$ is defined
to be one which cannot be deformed admissibly on $f \le F(C_k)$ into a $k$-cycle
on the domain $f < F(C_k)$. The class of all minimum $k$-cycles will be denoted
by $\mathfrak{N}_k$. Two $k$-cycles $C_k$, $C'_k$ in $\mathfrak{N}_k$ will be called equivalent if $F(C_k) = F(C'_k)$
and their sum is homologous on $f \le F(C_k)$ to zero or to the $k$-cycles on the
domain $f < F(C_k)$.

A $k$-chain $C_k$ bounded by a $(k-1)$-cycle $C_{k-1}$ will be admitted if its boundary
$C_{k-1}$ is in $\mathfrak{N}_{k-1}$. We associate with such a $k$-chain $C_k$ a set of deformations,
called admissible deformations, which never increase $F(C_k)$, which deform its
boundary $C_{k-1}$ admissibly, and which deform $C_k$ into an admissible $k$-chain $C'_k$

* Loc. cit. 





GENERALIZED MINIMAX PRINCIPLE 




related to $C_k$ as follows. Let $C'_{k-1}$ be the boundary of $C'_k$, and let $C''_k$ be a
$k$-chain on the domain $f \le F(C_{k-1})$ bounded by the $(k-1)$-cycle $C_{k-1} + C'_{k-1}$. The
$k$-cycle $C_k + C'_k + C''_k$ must be homologous to zero on $f \le F(C_k)$, if our
deformation is to be admissible.

By a minimum $k$-chain $C_k$ will be meant one which cannot be admissibly
deformed on the domain $f \le F(C_k)$ into a $k$-chain on $f < F(C_k)$. The class of all
minimum $k$-chains and $k$-cycles ($k > 0$) will be denoted by $\mathfrak{M}_k$. We set $\mathfrak{M}_0 = \mathfrak{N}_0$.
Two $k$-chains $C_k$, $C'_k$ in $\mathfrak{M}_k$ but not both in $\mathfrak{N}_k$ will be called equivalent
if $F(C_k) = F(C'_k)$, if the boundary of their sum bounds a $k$-chain $C''_k$ on $f < F(C_k)$,
and if the $k$-cycle $C_k + C'_k + C''_k$ is homologous on $f \le F(C_k)$ to zero or to the
$k$-cycles on $f < F(C_k)$.

We make the following assumptions. 

I. Every admissible $k$-cycle $C_k$ on $\Omega$ can be admissibly deformed into a $k$-cycle
in $\mathfrak{N}_k$.

II. Every admissible $k$-chain ($k > 0$) on $\Omega$ can be admissibly deformed into a
$k$-chain in $\mathfrak{M}_k$ having the same or an equivalent boundary.

III. The value $F(C_k)$ is attained by $f(P)$ on each $k$-chain $C_k$ in $\mathfrak{M}_k$.

The following lemma is immediate. 

Lemma 6.1. Let $M_k$, $N_k$ be respectively the number of non-equivalent $k$-chains
in $\mathfrak{M}_k$, $\mathfrak{N}_k$, and $R_k$ the $k$-th connectivity of $\Omega$. If the numbers $M_k$ ($k = 0, 1, \dots, m$)
($m$ arbitrary) are finite, under assumptions I, II the following relations hold.

(6.1) $M_k \ge N_k \ge R_k \ge 0$ $(k = 0, 1, \dots, m)$,

(6.2) $M_0 = N_0$, $M_j - N_j = N_{j-1} - R_{j-1}$ $(j = 1, \dots, m)$,

(6.3) $M_k - M_{k-1} + \dots + (-1)^k M_0 = N_k - R_{k-1} + R_{k-2} - \dots + (-1)^k R_0$,

(6.4) $M_k - M_{k-1} + \dots + (-1)^k M_0 \ge R_k - R_{k-1} + \dots + (-1)^k R_0$,

the equality holding if and only if $N_k = R_k$. If $M_k = R_k$, or if $M_{k+1} = R_{k+1}$,
then $N_k = R_k$. If $R_k$ is infinite, so also are $M_k$ and $N_k$.
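
The relations of Lemma 6.1 are easy to test numerically for given counts. The following is a minimal sketch, not part of the paper: the function names are hypothetical, and the sample data (mod 2 connectivities 1, 0, 1 of a 2-sphere, padded with a zero in dimension 3) are chosen only for illustration. It checks $M_k \ge R_k$ together with the alternating-sum relation (6.4) for every $k$.

```python
from typing import Sequence


def alternating_sum(values: Sequence[int], k: int) -> int:
    """Return values[k] - values[k-1] + ... + (-1)^k * values[0]."""
    return sum((-1) ** (k - j) * values[j] for j in range(k + 1))


def satisfies_relations(M: Sequence[int], R: Sequence[int]) -> bool:
    """Check M_k >= R_k and relation (6.4) for every k = 0, ..., m."""
    if len(M) != len(R):
        raise ValueError("M and R must have the same length")
    return all(
        M[k] >= R[k] and alternating_sum(M, k) >= alternating_sum(R, k)
        for k in range(len(M))
    )


# Illustrative data (an assumption, not taken from the paper):
# connectivities of a 2-sphere, with a zero entry for dimension 3.
sphere_R = [1, 0, 1, 0]

# One minimum, one saddle, two maxima: consistent with the relations.
print(satisfies_relations([1, 1, 2, 0], sphere_R))

# One minimum, no saddle, two maxima: the relation for k = 3 fails.
print(satisfies_relations([1, 0, 2, 0], sphere_R))
```

The second call shows why the alternating sums carry more information than the inequalities $M_k \ge R_k$ alone: every $M_k$ is at least $R_k$ there, yet the count is excluded by (6.4) at $k = 3$.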

Let $C_k$ be a $k$-chain in $\mathfrak{M}_k$. Let us deform $C_k$ by an admissible deformation
which diminishes $f$ at all points at which it is possible to do so. A point $P$
on the new $k$-chain $C_k$ at which $f(P) = F(C_k)$ will be called a critical point of
type $k$. The count of the critical points of type $k$ is defined to be the number
$M_k$ of non-equivalent $k$-chains in $\mathfrak{M}_k$. We have the following theorem.

Generalized Minimax Principle. Under hypotheses I, II, and III the critical
points of type $k$ exist and their counts $M_k$ satisfy the relations (6.4).

The minimax principle here given involves an ideal set. It can be realized 
at least in the non-degenerate cases considered in this paper. It is not clear 
that it can be realized in the general degenerate cases. The chief difficulty 
which arises is the construction of the classes $\mathfrak{M}_k$ and $\mathfrak{N}_k$. However, these
classes can in general be approximated by classes $\mathfrak{M}_{k\epsilon}$, $\mathfrak{N}_{k\epsilon}$, as suggested in
§5 above.




III 

The fixed end point problem in the Calculus of Variations 

One of the numerous applications of the minimax principle described in the
last section is to the fixed end point problem in the Calculus of Variations.
In this case we are interested in the existence and the classification of the
extremals of an integral of the form

$J = \int f(x, \dot{x})\, dt$

which join two fixed points $A_1$ and $A_2$ on a Riemannian manifold $\mathfrak{R}$. We shall
discuss only the non-degenerate case.

7. Hypotheses and definitions. We assume that $f(x, \dot{x})$ is a positive analytic
function for all points $(x, \dot{x})$ with $(x)$ on $\mathfrak{R}$ and $(\dot{x}) \ne (0)$. We assume
further that for these values of $(x, \dot{x})$

$f(x, k\dot{x}) = k f(x, \dot{x})$ $(k > 0)$,

and that $f(x, \dot{x})$ is positively regular, that is,

$f_{\dot{x}^i \dot{x}^k}\, \pi^i \pi^k > 0$ $(i, k = 1, \dots, n)$

for all $(\pi) \ne (\rho \dot{x})$.

An arc$^{10}$ of class $D'$ which joins the two fixed points $A_1$ and $A_2$ will be called
an admissible arc. The totality of admissible arcs will be defined as our space $\Omega$.
The arguments here given hold equally well in case the space $\Omega$ is taken to be
the totality of admissible arcs for which $J \le b$, where $b$ is an arbitrary fixed
constant.

We introduce a Fréchet distance $d(E_1, E_2)$ between the arcs $E_1$ and $E_2$ on $\Omega$
as follows. The geodesic distance between points $P_1$ and $P_2$ on $\mathfrak{R}$ is defined to be
the greatest lower bound of the lengths of the arcs on $\mathfrak{R}$ joining $P_1$ and $P_2$. Let
$H$ be a homeomorphism between the arcs $E_1$ and $E_2$ preserving sense and let $\delta(H)$
be the maximum geodesic distance between corresponding points under $H$.
We now define $d(E_1, E_2)$ to be the greatest lower bound of $\delta(H)$ for all
sense-preserving homeomorphisms $H$ between $E_1$ and $E_2$. Cycles and chains on $\Omega$ are
to be defined in the manner described by Morse.$^{11}$
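
A rough computational analogue of this distance may help fix ideas. The sketch below is an illustration only and rests on two assumptions that are not in the text: the arcs are replaced by finite sequences of sample points in the Euclidean plane, and the geodesic distance on the manifold is replaced by the ordinary Euclidean distance. Under those assumptions the greatest lower bound over sense-preserving correspondences becomes the standard discrete Fréchet distance, computed here by dynamic programming; the function name is hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def discrete_frechet(P: List[Point], Q: List[Point]) -> float:
    """Discrete Frechet distance between two sampled arcs P and Q.

    c[i][j] is the smallest achievable maximum distance between
    corresponding points when P[0..i] and Q[0..j] are traversed
    monotonically (sense preserved) from their first points.
    """
    if not P or not Q:
        raise ValueError("both arcs need at least one sample point")
    n, m = len(P), len(Q)
    c = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = math.dist(P[i], Q[j])
            if i == 0 and j == 0:
                c[i][j] = d
            elif i == 0:
                c[i][j] = max(c[0][j - 1], d)
            elif j == 0:
                c[i][j] = max(c[i - 1][0], d)
            else:
                # Advance on P, on Q, or on both; keep the best predecessor.
                best = min(c[i - 1][j], c[i - 1][j - 1], c[i][j - 1])
                c[i][j] = max(best, d)
    return c[n - 1][m - 1]


# Two parallel horizontal segments at unit distance (illustrative data).
lower = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
upper = [(0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
print(discrete_frechet(lower, upper))
```

The restriction to monotone traversals is what plays the part of the sense-preserving homeomorphisms $H$; dropping it would give a Hausdorff-type distance instead.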

8. Critical extremals. An extremal is a solution of class $C''$ of the Euler
equations

$f_{x^i} - (d/dt) f_{\dot{x}^i} = 0$ $(i = 1, \dots, n)$.

10 An arc $x^i = x^i(t)$ ($t_1 \le t \le t_2$) will be said to be of class $D'$ if it is continuous and is
composed of a finite number of sub-arcs on each of which the $x^i(t)$ have continuous derivatives
not all zero, for a suitable choice of the parameter $t$.

11 Morse, pp. 193-5.




A critical extremal is defined to be one which joins the two given points $A_1$ and
$A_2$. The value of $J$ along a critical extremal will be called a critical value of $J$.

In the analytic case here considered there are at most a finite number of critical 
values of J less than a given constant b. A proof of this fact has been given 
by Morse. 13 

A critical extremal $E$ will be said to be non-degenerate if its end points are
not conjugate. We shall assume that the critical extremals are all non-degenerate.
In this case critical extremals are isolated and there are but a finite number
of extremals corresponding to each critical value.**

The type number of a non-degenerate critical extremal $E$ is defined to be the
sum of the orders of the conjugate points on $E$ of the initial point $A_1$. We
shall prove the following

Theorem 8.1. Let $M_k$ be the number of critical extremals of type $k$ and $R_k$
the $k$-th connectivity of $\Omega$. For the values $M_k$, $R_k$ that are finite the relations

$M_k \ge R_k$,

$M_k - M_{k-1} + \dots + (-1)^k M_0 \ge R_k - R_{k-1} + \dots + (-1)^k R_0$

are true. The equality in the last expression holds in case either $M_k = R_k$ or
$M_{k+1} = R_{k+1}$. If $R_k$ is infinite, $M_k$ is infinite.

This theorem will follow at once from the generalized minimax principle of
§6 if we establish the following facts. The classes $\mathfrak{M}_k$, $\mathfrak{N}_k$ here used are defined
as in §6. We denote the maximum value of $J$ on a chain $C_k$ by $J(C_k)$.

I. Every $k$-cycle $C_k$ which is non-bounding on the domain $J \le J(C_k)$ can be
admissibly deformed into a $k$-cycle in $\mathfrak{N}_k$.

II. Every $k$-chain $C_k$ whose boundary is in $\mathfrak{N}_{k-1}$ can be admissibly deformed
into a $k$-chain in $\mathfrak{M}_k$ having the same or an equivalent boundary.

III. The number of non-equivalent $k$-chains $C_k$ in $\mathfrak{M}_k$ having $J(C_k) = b$ is equal
to the number of critical extremals $E$ of type $k$ having $J(E) = b$.

9. The space $\Omega_m$. In order to prove the three statements made at the end
of the last section it is convenient to study first particular sub-spaces of $\Omega$
which we denote by $\Omega_m$ and which we shall now define.

We term the value of $J$ along a curve $E$ the $J$-length of $E$. Let $\rho$ be a constant
so small that every extremal of $J$-length $\le 2\rho$ affords a proper minimum to $J$
in the class of all admissible arcs joining its end points. An extremal segment
of $J$-length $\le \rho$ will be called an elementary extremal. This terminology is due
to Morse.

The space $\Omega_m$ is now defined as the totality of curves in $\Omega$ composed of at most
$m + 1$ elementary extremals. The end points of the successive elementary
extremals on an arc $E$ in $\Omega_m$ form a sequence

(9.1) $P_0 = A_1, P_1, \dots, P_m, P_{m+1} = A_2$,

** Loc. cit., p. 199. 

** Morse, p. 230. 




which we call the vertices of E. We admit the possibility of successive vertices 
being coincident. 

Lemma 9.1. Let $C_k$ be any $k$-chain on $\Omega$. If $m$ is sufficiently large, then $C_k$
can be admissibly deformed into a $k$-chain on $\Omega_m$.

To prove this, divide the interval $0 \le t \le 1$ of the parameter $t$ of the arcs on
$C_k$ into $m + 1$ equal segments $t_i t_{i+1}$. For $m$ sufficiently large, the points $P_i$, $P'_i$
on any arc $E$ of $C_k$ determined by values $t_i$, $t'_i$ ($t'_i$ on $t_i t_{i+1}$) can be joined by an
elementary extremal $E_i$. Assign to a point $P$ on $E_i$ the parameter value $t$ which
divides $t_i t_{i+1}$ in the same ratio as $P$ divides $E_i$. Let $t'_i$ vary continuously from
$t_i$ to $t_{i+1}$, the arc $P_i P'_i$ on $E$ being replaced by the corresponding extremal $E_i$.
The chain $C_k$ is then deformed admissibly into a $k$-chain on $\Omega_m$, as one readily
verifies. The deformation here used is due to Morse (p. 205).

Thus we see that we can restrict ourselves for the most part to the study of
$k$-cycles and $k$-chains on $\Omega_m$.

The following lemma is useful in establishing the existence of extremals. 

Lemma 9.2. The space $\Omega_m$ is compact.

For let $\{E_n\}$ be a sequence of curves in $\Omega_m$ and let

(9.2) $P_0^{(n)} = A_1, P_1^{(n)}, \dots, P_m^{(n)}, P_{m+1}^{(n)} = A_2$

be a set of vertices on the arcs $E_n$. This sequence of vertices has at least one
accumulation set (9.1) such that the points $P_i$, $P_{i+1}$ can be joined by an
elementary extremal. The vertices (9.1) determine an arc $E$ in $\Omega_m$, which is clearly
an accumulation curve of the sequence $\{E_n\}$. The lemma is therefore true.


10. The deformation $\Delta$. Our principal deformation, which we shall denote
by $\Delta$, can be defined as follows. Let $E$ be an arc in $\Omega_m$ with vertices (9.1). As
the time $t$ varies from 0 to 1/2, points $Q_i$ ($i = 0, 1, \dots, m$) move on $E$ from
the points $P_i$ towards $P_{i+1}$ at a $J$-rate$^{14}$ equal to the $J$-length of $P_i P_{i+1}$. The
vertices

$A_1, Q_0, \dots, Q_m, A_2$

determine a curve in $\Omega_{m+1}$ which varies continuously from the curve $E$ to a curve
$\bar{E}$ as $t$ varies from 0 to 1/2. As $t$ varies from 1/2 to 1 let the points $P'_i$ ($i =
1, \dots, m$) move on $\bar{E}$ from $Q_i$ towards $Q_{i-1}$ at a $J$-rate equal to the $J$-length of
$Q_{i-1}Q_i$, the point $P'_0$ moving at a $J$-rate$^{14}$ equal to twice the $J$-length of $A_1 Q_0$.
The vertices

$A_1, P'_0, P'_1, \dots, P'_m, A_2$

determine a curve which varies continuously on $\Omega_{m+1}$ from $\bar{E}$ to a curve $E'$
as $t$ varies from 1/2 to 1. The final curve $E'$ is in $\Omega_m$, since here $P'_0 = A_1$. The
deformation thus defined will be called $\Delta(t)$ ($0 \le t \le 1$). Deformations of this
type have been used by Birkhoff and Morse.
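
The net effect of the deformation at the time $t = 1$ can be imitated in the simplest possible setting. The sketch below is an illustration under stated assumptions, not the construction of the paper: the manifold is taken to be the Euclidean plane, $J$ is ordinary arc length, every straight segment is treated as an elementary extremal (so the bound on $J$-length is ignored), and only the end result of each stage is computed, not the continuous motion. The names `delta_step`, `midpoint`, and `length` are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def midpoint(a: Point, b: Point) -> Point:
    """Midpoint of the straight segment ab (the J-mid-point when J is length)."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0)


def length(vertices: List[Point]) -> float:
    """Length of the broken line through the given vertices."""
    return sum(math.dist(a, b) for a, b in zip(vertices, vertices[1:]))


def delta_step(vertices: List[Point]) -> List[Point]:
    """End result of one application of the two-stage midpoint deformation.

    Input:  [A1, P_1, ..., P_m, A2].
    Stage 1: Q_i is the midpoint of P_i P_{i+1} (i = 0, ..., m).
    Stage 2: P'_i is the midpoint of Q_{i-1} Q_i (i = 1, ..., m),
             while P'_0 returns to A1.
    Output: [A1, P'_1, ..., P'_m, A2].
    """
    if len(vertices) < 2:
        raise ValueError("need at least the two end points")
    q = [midpoint(a, b) for a, b in zip(vertices, vertices[1:])]
    inner = [midpoint(a, b) for a, b in zip(q, q[1:])]
    return [vertices[0]] + inner + [vertices[-1]]


# A zigzag from A1 = (0, 0) to A2 = (4, 0) (illustrative data).
curve = [(0.0, 0.0), (1.0, 1.0), (2.0, -1.0), (3.0, 1.0), (4.0, 0.0)]
for step in range(5):
    print(step, round(length(curve), 6))
    curve = delta_step(curve)
```

In this flat model the length never increases under a step, by the triangle inequality, and repeated steps drive the broken line toward the straight segment joining the end points, which is the only extremal here.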

14 The $J$-rate of $P_i$ is defined as follows. The $J$-length of the arc $P_{i-1} P_i$ is a function
$h(t)$ of the time $t$. If $h(t)$ is differentiable, then the quantity $h'(t)$ will be called the $J$-rate
of the point $P_i$. See Morse, p. 199.




The following lemma is immediate. 

Lemma 10.1. Under the deformation $\Delta(t)$ ($0 \le t \le 1$) a curve $E$ on $\Omega_m$ is
deformed continuously through curves of $\Omega_{m+1}$ into a curve $E'$ on $\Omega_m$ such that
$J(E) \ge J(E')$, the equality holding if and only if $E = E'$, that is, if and only if $E$ is an
extremal. Moreover, $k$-chains are deformed admissibly under $\Delta$.

We have the further result 

Lemma 10.2. Let $\{E_n\}$ be a sequence of curves in $\Omega_m$ having a unique limit
curve $E$. Let $E'_n$, $E'$ be the images of $E_n$, $E$ under $\Delta(t)$. The curve $E'$ is the unique
limit curve of the sequence $\{E'_n\}$.

To prove this let (9.2) be a set of vertices for the curves $E_n$. We may assume
that these points have been chosen so as to have a unique limit set (9.1), the
vertices of $E$. Let $Q_i^{(n)}$ be the $J$-mid-points of the arcs $P_i^{(n)} P_{i+1}^{(n)}$, and $P_i'^{(n)}$ be the
$J$-mid-points of $Q_{i-1}^{(n)} Q_i^{(n)}$. The points

$P_0'^{(n)} = A_1, P_1'^{(n)}, \dots, P_m'^{(n)}, P_{m+1}'^{(n)} = A_2$

are the vertices of the curves $E'_n$. It is clear that this set has a unique limit set

(10.1) $P'_0 = A_1, P'_1, \dots, P'_m, P'_{m+1} = A_2$,

namely, the $J$-mid-points of the arcs $Q_{i-1} Q_i$, where $Q_i$ is the unique limit point
of $Q_i^{(n)}$. The set (10.1) forms a set of vertices for $E'$ and the lemma is
established.

Lemma 10.3. Let $S$ be a set of arcs on $\Omega_m$ such that the closure of $S$ contains no
extremal arc. There exists a positive constant $d$ such that if $E$ is a curve of $S$ and
$E'$ its image under $\Delta(t)$, then

$J(E) \ge J(E') + d$.

For suppose the lemma were false. Then for each term $d_n$ of a sequence of positive
constants tending to zero there would exist a curve $E_n$ of $S$ such that its image
$E'_n$ under $\Delta$ would satisfy the relation

(10.2) $J(E_n) \le J(E'_n) + d_n$ $(n = 1, 2, \dots)$.

The sequence $\{E_n\}$ could be modified so as to have a limit curve $E$. The
sequence $\{E'_n\}$ would then have a unique limit curve $E'$, the image of $E$ under
$\Delta$ by Lemma 10.2. From the relation (10.2) we could conclude that $J(E) \le
J(E')$, and hence that $J(E) = J(E')$, by Lemma 10.1. But this could be true
only in case $E$ is an extremal, contrary to our assumption that the closure of $S$
contains no extremal arc. This proves Lemma 10.3.

The following lemma establishes the existence of critical extremals. 

Lemma 10.4. Let $\{E_n\}$ be a sequence of curves on $\Omega_m$ such that $E_n$ is the image
of $E_{n-1}$ under $\Delta(t)$. The sequence $\{E_n\}$ has a unique limit curve which is an
extremal.

For by Lemma 10.1 we have

$J(E_{n-1}) \ge J(E_n) \ge 0$.




It follows that the numbers $J(E_n)$ have a greatest lower bound $J_0$. Clearly
$J(E) = J_0$ for every accumulation curve $E$ of $\{E_n\}$. Moreover, $E$ is an
extremal, since its image $E'$ under $\Delta(t)$ is also an accumulation curve of $\{E_n\}$
by Lemma 10.2 and our choice of $E_n$ as the image of $E_{n-1}$. The uniqueness
follows readily with the help of Lemma 10.3 since our critical extremals, being
non-degenerate, are isolated.

11. Deformations in a neighborhood of an extremal. Consider now an
extremal $E$ of type $k$. Choose the integer $m$ and if necessary diminish the
constant $\rho$ of §9 so that

$m\rho < J(E) < (m + 1)\rho$.

Let $\eta$ be a positive constant so small that every arc $E'$ in $\Omega_m$ at a distance $d \le \eta$
from $E$ is such that

$m\rho < J(E') < (m + 1)\rho$.

Divide $E$ into $m + 1$ sub-arcs of equal $J$-length by points $p_1, \dots, p_m$. Through
each of these points pass regular analytic manifolds $\pi_1, \dots, \pi_m$ cutting $E$
orthogonally. Let us denote by $\Pi$ the class of arcs of $\Omega_m$ which lie in the
$\eta$-neighborhood just described and which have their vertices on the manifolds
$\pi_1, \dots, \pi_m$. We have the following

Lemma 11.1. Let $C_k$ be a $k$-chain on $\Omega_m$ such that the arcs on $C_k$ at a distance $d$,
$0 < \delta \le d \le \eta$, from $E$ are on the domain $J < J(E)$. If the constant $\delta$ is
sufficiently small, the chain $C_k$ can be admissibly deformed into a $k$-chain $C'_k$ on $\Omega_m$
such that the arcs on $C'_k$ at a distance $d \le \delta$ are on the domain $\Pi$, the arcs at a
distance $d$ ($\delta \le d \le \eta$) are on $J < J(E)$, and the arcs at a distance $d \ge \eta$ are identical
with those on $C_k$.

This deformation can be accomplished as follows. Let $P_1, \dots, P_m$ be,
respectively, the points in which an arc $E'$ in the $\eta$-neighborhood of $E$ intersects
the surfaces $\pi_1, \dots, \pi_m$. Let $P_0 = A_1$ and $P_{m+1} = A_2$. If $E'$ is at a distance
$d \le \eta/2$ from $E$, we deform $E'$ as follows. As the time $t$ varies from 0 to 1, the
points $P'_i$ move on $E'$ from $P_{i-1}$ to $P_i$ at a $J$-rate equal to the $J$-length of the
sub-arc $P_{i-1}P_i$ on $E'$, the arc $P_{i-1}P'_i$ being replaced by the elementary extremal
$P_{i-1}P'_i$. If $E'$ is at a distance $d$, $\eta/2 \le d \le \eta$, the points $P'_i$ move from $P_{i-1}$
towards $P_i$ at a $J$-rate equal to $2(\eta - d)/\eta$ times the $J$-length of $P_{i-1}P_i$. The
arcs on $C_k$ at a distance $d \ge \eta$ from $E$ are held fast. Under this deformation
$C_k$ is deformed into a $k$-chain $C'_k$ having the properties described in the lemma.

Consider now the manifolds $\pi_1, \dots, \pi_m$ described at the beginning of this
section. The point $P_i$ on the manifold $\pi_i$ is determined by a set of $n - 1$
parameters $v_{i1}, \dots, v_{i,n-1}$. Let $q = m(n - 1)$ and let the first $n - 1$ variables
of the set $u_1, \dots, u_q$ be the parameters on $\pi_1$, the next $n - 1$ the parameters on $\pi_2$,
and so on. We have further

Lemma 11.2. On the domain $\Pi$ the functional $J$ is an analytic function $f(u)$




of the parameters $u_1, \dots, u_q$. The function $f(u)$ has a non-degenerate critical
point of type $k$ at the point $(u_0)$ corresponding to the extremal $E$.

The first statement is immediate. The second statement can be readily
established by elementary means with the help of the positive regularity of our
integral $J$. An explicit proof has been given by Morse.$^{14}$

Let $E$ be a critical extremal of type $k$. By the last lemma we see that in a
sufficiently small neighborhood $S$ of $E$ the integral $J$ is an ordinary function
$f(u)$ of $q$ variables. It can readily be seen in a manner analogous to that given
in the paragraph preceding Lemma 3.2 that there is a $k$-chain $C_k$ on $S$ having
its boundary $C_{k-1}$ on the domain $J < J(E)$ and having $J = J(C_k)$ only at the
arc $E$ on $C_k$. The cycle $C_{k-1}$ is non-bounding in $S$ on $J < J(E)$. Moreover,
every $(k - 1)$-cycle on this domain is homologous to zero or to $C_{k-1}$ on this
domain. A $j$-chain $C_j$ on $S$ whose boundary is on $J < J(E)$ will be said to be
equivalent to $C_k$ if $j = k$ and if the boundary of $C_j$ is homologous to the boundary
of $C_k$ on the domains $S$ and $J < J(E)$.

Lemma 11.3. A $j$-chain $C_j$ on $S$ whose boundary is on the domain $J < J(E)$
and which is not equivalent to $C_k$ can be admissibly deformed into a $j$-chain on the
domain $J < J(E)$ having the same boundary. A $j$-chain $C_j$ on $S$ which is equivalent
to $C_k$ and has its boundary on $J < J(E)$ cannot be so deformed but can be
deformed into one having the same boundary and having $J = J(C_j)$ only at the arcs
which coincide with $E$.

This result can be established by an argument like that given in the proof of 
Lemma 3.2 with the help of Lemma 7.1 of Morse, p. 169. 

12. Proofs of three statements. The statements I, II, III made at the
end of §8 can now be established as follows. Consider any admissible $j$-chain
$C_j$. By successive application of the deformation $\Delta$ of §10 we may deform $C_j$
so that the arcs on which $J \ge b$ lie in an arbitrarily small neighborhood of a
set $\omega$ of critical extremals on which $J = b$. If this neighborhood is sufficiently
small, it will consist of a finite number of non-overlapping neighborhoods each
of which contains but one extremal and is such that $J$ is a function $f(u)$ as
described in Lemma 11.2. Let $S$ be any one of these neighborhoods and let $E$
be the extremal interior to $S$. Denote the type number of $E$ by $k$. By successive
applications of the deformation $\Delta$ the $j$-chain $C_j$ can be deformed so that the
only arcs on $C_j$ at which $J \ge b$ are those lying in an arbitrarily small closed
neighborhood $S'$ of $E$ interior to $S$. If $C_j$ is finely enough divided, there will
be a $j$-chain $C'_j$ composed of all the cells of $C_j$ interior to and on the boundary
of $S$ and having its boundary on the domain between $S$ and $S'$. According to
Lemma 11.3 the $j$-chain $C'_j$ can be deformed into one on the domain $J < J(E)$
having the same boundary except in the case in which $C'_j$ is equivalent to the
chain $C_k$ described in the paragraph preceding Lemma 11.3. If $C'_j$ is equivalent
to $C_k$, then $C'_j$ can be deformed into a chain $C''_j$ containing $E$, having $J(C''_j) =
J(E)$, and having the same boundary as $C'_j$. Thus we see that the whole chain

“ Pp. 196-7. 




$C_j$ can be deformed admissibly into a $j$-chain in the class $\mathfrak{M}_j$ such that the points
at which $J = J(C_j)$ are extremals of type $j$. This proves statements I and II.

In order to prove statement III, we first note that each extremal $E$ of type $j$
has associated with it a $j$-chain $C_j$ in $\mathfrak{M}_j$ such that $E$ is on $C_j$ and $J(E) = J(C_j)$.
For, according to the remarks preceding Lemma 11.3, there is one and but one
independent $(j - 1)$-cycle $C_{j-1}$ in a sufficiently small neighborhood $S$ of $E$
which is on the domain $J < J(E)$ in $S$, is non-bounding in this domain, and
bounds a $j$-chain $C'_j$ in $S$ containing the extremal $E$ and having $J = J(C'_j)$ only
at $E$. The $(j - 1)$-cycle $C_{j-1}$ is homologous on the domain $J < J(E)$ to zero
or to a $(j - 1)$-cycle $C'_{j-1}$ in $\mathfrak{N}_{j-1}$ on this domain. Let $C''_j$ be a chain bounded
by $C_{j-1}$ or by $C_{j-1} + C'_{j-1}$ as the case may be. The $j$-chain $C_j = C'_j + C''_j$ is
in $\mathfrak{M}_j$, contains $E$, and has $J = J(C_j)$ only at $E$.

Consider now a set $\omega$ of critical extremals on which $J = b$. Let $E_1, \dots, E_m$
be the extremals of type $j$ in $\omega$ and let $C_j^1, \dots, C_j^m$ be $j$-chains of the type
described in the last paragraph. These $j$-chains are clearly non-equivalent. We
shall now prove that any $j$-chain $C_j$ in $\mathfrak{M}_j$ with $J(C_j) = b$ is equivalent to some
linear combination of these $m$ $j$-chains. To do so, deform $C_j$ admissibly, if
necessary, so that the set of points on $C_j$ at which $J = J(C_j)$ is composed of some
sub-set of the extremals $E_1, \dots, E_m$, say the extremals $E_1, \dots, E_r$. Let
$S_1, \dots, S_r$ be neighborhoods of these points chosen as in the last paragraph.
Let $\eta$ be a positive constant so small that the $\eta$-neighborhood of $E_\alpha$ lies in $S_\alpha$
($\alpha = 1, \dots, r$). We may suppose that $C_j$ has been so deformed that the
points on $C_j$ outside these $\eta$-neighborhoods are on the domain $J < b$ and so that
$C_j$ cannot be admissibly deformed out of any one of these neighborhoods. If
$C_j$ is sufficiently finely divided, there will be a $j$-chain $\gamma_j^\alpha$ composed of all the
cells of $C_j$ having points in the $\eta$-neighborhood of $E_\alpha$ and having its boundary
$\gamma_{j-1}^\alpha$ in $S_\alpha$ on $J < b$. Moreover, the chain $C_j^\alpha$, if finely enough divided, has a
similar $j$-chain $\bar{C}_j^\alpha$ with boundary $\bar{C}_{j-1}^\alpha$ on $S_\alpha$ and $J < b$. The $(j - 1)$-cycles
$\gamma_{j-1}^\alpha$, $\bar{C}_{j-1}^\alpha$ are homologous on the domain $J < b$ in $S_\alpha$ and hence bound a $j$-chain
$\beta_j^\alpha$ in $S_\alpha$ on $J < b$. The chain $\bar{C}_j^\alpha + \gamma_j^\alpha + \beta_j^\alpha$ forms a $j$-cycle $\delta_j^\alpha$, which is
homologous to zero on the domain $J \le b$ in $S_\alpha$. Let

$\bar{C}_j = C_j^1 + \dots + C_j^r$, $\delta_j = \delta_j^1 + \dots + \delta_j^r$,

and consider the relation

$\delta_j = (C_j + \bar{C}_j) + (C_j + \bar{C}_j + \delta_j)$.

The $j$-chain in the last parentheses is clearly on the domain $J < b$. Moreover,
by construction $\delta_j$ is homologous to zero on the domain $J \le b$. Hence from
the definition of equivalence in §6 we see that $C_j$ is equivalent to $\bar{C}_j$, as was
to be proved. Statement III of §8 is accordingly established.

Harvard University. 





Reprinted from Memoriae Pont. Acad. Sci. Novi Lyncaei, s. 3, Vol. 1 
1935, pp. 85-216. 


NOUVELLES RECHERCHES SUR LES SYSTÈMES DYNAMIQUES (*)

by GEORGE D. BIRKHOFF

Memoir awarded a prize by the Pontifical Academy of Sciences on the occasion of the
« Pio XI » Prize Competition: Circa systemata solutionum aequationum differentialium.


INTRODUCTION. 


Dynamical systems always correspond to ordinary differential systems
of the classical form

(1) $\dfrac{dx_i}{dt} = X_i(x_1, \dots, x_n)$ $(i = 1, \dots, n)$,

where the independent variable $t$ denotes the "time", while $X_1, X_2, \dots, X_n$
are given functions of the dependent variables $x_1, \dots, x_n$; all
these variables are real, and the functions $X_i$ are real and ordinarily
analytic. Indeed, in a very broad sense of the words, the study of
such dynamical systems is nothing other than that of the differential
systems (1) in the real domain.

From the modern point of view one seeks the qualitative properties of the
solutions $x_1(t), x_2(t), \dots, x_n(t)$, starting from the equations (1). Consequently,
two systems (1), one of which is deduced from the other by a point
transformation of the dependent variables $x_1, x_2, \dots, x_n$, are regarded as
equivalent (1). Hence the group of transformations of this kind plays a
fundamental role.


(*) In 1930 the Collège de France did me the great honor of inviting me to give some
lectures under the Fondation Michonis. Thanks to this occasion, so fortunate for me, I
took up once more the dynamical questions that had interested me most. The title of my
lectures was « Le dernier théorème de Géométrie de Poincaré, ses généralisations, et ses
applications à la Dynamique »; they were given during the last fortnight of April, 1931.
The present Memoir contains some of the results that I stated there and others that I have
obtained more recently.

It was Poincaré who first concerned himself with the qualitative properties
of the solutions. Before him, Lagrange, Hamilton, and Jacobi, among others,
had concerned themselves almost exclusively with the formal side of the theory,
that is, with the quantitative solution of the system (1), whether by the use of
explicitly known integrals or by the use of infinite series; but,
with the exception of a few completely integrable problems such as that
of two bodies, their results had given little information on
the general behavior of the solutions.

At first, Poincaré dealt with the simplest case, $n = 2$ (2). Geometric
methods enabled him to discuss the solutions in the case where
they could be represented by trajectories traced on an ordinary
surface. The points of equilibrium and the closed trajectories corresponding
to the periodic solutions played the principal role.

Later, Poincaré devoted long years to certain cases $n = 3$.
In fact, the restricted problem of three bodies reduces to a system
of the third order by the use of the known integral, first employed
by Euler. The three volumes of Poincaré's celebrated Méthodes nouvelles de la Mécanique
céleste are in large part devoted to this particular problem
because of its high astronomical interest. In this connection one must cite the
earlier memoirs of Hill (3), which had greatly stimulated Poincaré.

Most of the very considerable progress achieved since Poincaré
concerns dynamical systems with two degrees of freedom. Here one must recall
above all the brilliant achievements of Hadamard (4) and of Levi-Civita (5);
I have worked a great deal in this field myself (6).


(1) One also admits a transformation of the independent variable defined by an
equation $dt = \varphi(x_1, \dots, x_n)\, d\tau$ $(\varphi > 0)$.

(2) See his classical memoirs Sur les courbes définies par les équations différentielles,
Journal de Mathématiques, 3e série, t. 7 (1881), t. 8 (1882), 4e série, t. 1 (1885), t. 2 (1886).

(3) Researches in the Lunar Theory, American Journal of Mathematics, t. 1 (1878); On
the Part of the Motion of the Lunar Perigee Which is a Function of the Mean Motions of the
Sun and Moon, Acta Mathematica, t. 8 (1886).

(4) See for example his memoir Les surfaces à courbures opposées et leurs lignes
géodésiques, Journal de Mathématiques, 5e série, t. 4 (1898).

(5) See for example his memoirs Sopra alcuni criteri di instabilità, Annali di
Matematica, 3e série, t. 5 (1900) and Sur la résolution qualitative du problème des trois corps,
Acta Mathematica, t. 30 (1906).

(6) See for example my book Dynamical Systems. This book will be cited simply as
D. S. in what follows.




One might ask whether the results already obtained in the case $n = 3$ of
dynamical origin cannot be extended to the general case $n > 3$. A first
answer to this question is the following: For the case $n = 3$ only,
the analysis leads us to questions of two-dimensional analysis situs,
in particular with regard to a point transformation, $T$, of an ordinary
surface into itself. Thanks to the simple structure of such transformations,
considerable progress can be achieved here. On the other hand, for the case $n > 3$,
the analysis leads us in a similar manner to a point transformation
in $n - 1$ ($> 2$) dimensions. Such a transformation possesses in general
a structure enormously more complicated than that of a transformation
in only two dimensions, and this fact prevents us from attaining
equally definitive results. Thus I succeed in showing in the present
Memoir that, in the "regular" case $n = 3$ of dynamical origin, there exists a
one-to-one correspondence between the positive integers 1, 2, 3, ... and the
elements of a denumerable, well-ordered set, dense in itself, which
characterizes the differential system from the topological point of view; I call the
corresponding symbol the "signature" of the system. In the case $n > 3$ an
analogous characterization would have to be enormously more complicated.

I confine myself in general to the study of dynamical systems with two
degrees of freedom, which lead us to the case $n = 3$, the most interesting and the
most difficult of all. Nevertheless I should like to advance the general
theory in the other cases $n = 3$, and thus the general theory of every system
of order $n$ which admits $n - 3$ known integrals. For this reason I give
indications of the extensions and modifications that have occurred to me.

There is a second, more fundamental answer to the question posed
above. We shall see in the case $n = 3$ that the central role must be
accorded to the formally unstable periodic solutions and to their asymptotic
families of solutions, since the corresponding trajectories subdivide
the space of trajectories into small parts. But in the case $n = 2m - 1 > 3$
of dynamical origin the asymptotic families of trajectories lying in
$S_{2m-1}$ are of $m - 1$ ($< 2m - 2$) dimensions at most and cannot separate
the trajectories of $S_{2m-1}$ from one another in an analogous manner.
Consequently, to create a definitive theory for $n > 3$ it will be necessary either
to discover new analytic entities that can be used, or
to employ non-analytic entities such as the recurrent solutions
that I have introduced (1). The second of these alternatives seems to me much
more probable. In fact, even in certain cases $n = 3$, as we shall
see, it seems necessary to employ the recurrent solutions. Hence it is
only in the case $n = 3$ that Poincaré's well-known opinion seems completely
valid, that the periodic solutions are "the only breach
through which we may attempt to penetrate a place hitherto reputed
inaccessible".


(1) See D. S., Chap. 7.


CHAPTER I.

Pfaffian equations of the fourth order.


1. Reduction to a system of the third order.


We shall take as our starting point the following variational equation:

(2)    $\delta I = \delta \int_{t_0}^{t_1} \Big( \sum_{i=1}^{4} A_i\, dx_i + B\, dt \Big) = 0.$

The functions $A_i$ and $B$ that enter here are real and analytic in the
four real variables $x_1, x_2, x_3, x_4$. Moreover, we suppose that the
skew-symmetric determinant

(3)    $\Delta = \left| \dfrac{\partial A_i}{\partial x_j} - \dfrac{\partial A_j}{\partial x_i} \right| \qquad (i, j = 1, 2, 3, 4)$

is not equal to zero in the region $S_4$ that we consider. The points
$P_0$ and $P_1$ of $S_4$, corresponding to the values $t_0$ and $t_1$ of $t$
respectively, are regarded as arbitrary and given in advance.

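Applying Euler's rule, one obtains the following differential system
of the fourth order:

(4)    $\sum_{j=1}^{4} \left( \dfrac{\partial A_i}{\partial x_j} - \dfrac{\partial A_j}{\partial x_i} \right) \dfrac{dx_j}{dt} - \dfrac{\partial B}{\partial x_i} = 0 \qquad (i = 1, 2, 3, 4).$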


This system is entirely equivalent to the condition $\delta I = 0$, and defines
the extremals that join $P_0$ to $P_1$. Since $\Delta \neq 0$, one can always solve
equations (4) for $dx_i/dt$ $(i = 1, 2, 3, 4)$, which gives a system of the
form (1) with $n = 4$. These "Pfaffian" equations have a quite particular
importance for classical dynamics (1). The so-called "canonical" equations
correspond to the particular case $A_1 = x_3$, $A_2 = x_4$, $A_3 = A_4 = 0$.

(1) See D. S., Chap. 2, 3.

An evident but very fundamental property of such a Pfaffian system
is the following: one can transform the dependent variables $x_i$ into any
other dependent variables $y_i$ by introducing the new variables directly
under the integral sign in the integral $I$. For this reason a Pfaffian
system remains of the same type under an arbitrary change of the dependent
variables. Moreover, a direct calculation gives

(5)    $\Delta_x = \Delta_y \left[ \dfrac{\partial (y_1, \ldots, y_4)}{\partial (x_1, \ldots, x_4)} \right]^2,$

whence the conclusion that, if the functional determinant of the
transformation does not vanish, $\Delta_x$ cannot reduce to zero unless $\Delta_y$
likewise reduces to zero.

If equations (4) are multiplied by $dx_i/dt$ $(i = 1, 2, 3, 4)$ respectively
and added, one sees that $dB/dt = 0$ identically, that is, $B = \text{const.} = b$
is always an integral (1). For a particular value of $b$, the equation $B = b$
holds in general on certain three-dimensional surfaces $S_3$ composed of
$\infty^2$ trajectories; or, exceptionally, on ordinary surfaces $S_2$ composed
of trajectories, or on isolated trajectories, or at the equilibrium points,
for which $dx_i/dt = 0$ $(i = 1, 2, 3, 4)$.
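
The step leading to the integral $B = \text{const.}$ can be written out explicitly. The following is only a sketch: it assumes that (4) has the Euler form $\sum_j a_{ij}\,\dot x_j - \partial B/\partial x_i = 0$ with $a_{ij} = \partial A_i/\partial x_j - \partial A_j/\partial x_i$, which is what Euler's rule gives for the integrand $\sum A_i\,dx_i + B\,dt$; the abbreviations $a_{ij}$ and $\dot x_i = dx_i/dt$ are introduced here for convenience.

```latex
\begin{align*}
0 &= \sum_{i=1}^{4} \dot x_i
     \Big( \sum_{j=1}^{4} a_{ij}\,\dot x_j - \frac{\partial B}{\partial x_i} \Big)
   && \text{(multiply (4) by } \dot x_i \text{ and add)} \\
  &= \sum_{i,j=1}^{4} a_{ij}\,\dot x_i\,\dot x_j
     - \sum_{i=1}^{4} \frac{\partial B}{\partial x_i}\,\dot x_i \\
  &= 0 - \frac{dB}{dt}
   && \text{(since } a_{ij} = -a_{ji} \text{ and } B \text{ does not contain } t\text{)} .
\end{align*}
```

The double sum vanishes because each pair of terms $a_{ij}\dot x_i\dot x_j + a_{ji}\dot x_j\dot x_i$ cancels, so only the total derivative of $B$ along the trajectory remains.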

Suppose now that a change of the dependent variables is made such that,
in the variables $y_i$ $(i = 1, 2, 3, 4)$, we have $B = y_4$. Such a change
will always exist, except in the neighborhood of an equilibrium point, at
which all the derivatives vanish at once. If the coefficients of $dy_i$ in
the transformed integral $I$ are denoted by $C_i$, the four Pfaffian
equations are written as follows, on suppressing the terms in $dy_4/dt$ (2):

(6)    $\sum_{j=1}^{3} \left( \dfrac{\partial C_i}{\partial y_j} - \dfrac{\partial C_j}{\partial y_i} \right) \dfrac{dy_j}{dt} = 0 \quad (i = 1, 2, 3), \qquad \sum_{j=1}^{3} \left( \dfrac{\partial C_4}{\partial y_j} - \dfrac{\partial C_j}{\partial y_4} \right) \dfrac{dy_j}{dt} - 1 = 0.$

(1) One may set aside the case where $B$ is identically constant, because the system (4)
then reduces to the entirely trivial case $dx_i/dt = 0$ $(i = 1, 2, 3, 4)$.

(2) Indeed the integral $y_4 = \text{const.}$, deduced from the complete Pfaffian equations, shows
that $dy_4/dt = 0$.



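The first three equations written show that we may set

$\dfrac{dy_i}{dt} = \lambda \left( \dfrac{\partial C_k}{\partial y_j} - \dfrac{\partial C_j}{\partial y_k} \right), \qquad (i, j, k) = (1, 2, 3) \text{ in cyclic order},$

since the determinant $\Delta$ is not equal to zero. The fourth equation then
gives simply $\lambda \sqrt{\Delta} = 1$ (1). Hence the equations in $x_1, x_2, x_3, x_4$
reduce to the system of the third order in $y_1, y_2, y_3$ $(y_4 = h)$,

(7)    $\dfrac{dy_i}{dt} = \dfrac{1}{\sqrt{\Delta}} \left( \dfrac{\partial C_k}{\partial y_j} - \dfrac{\partial C_j}{\partial y_k} \right), \qquad (i, j, k) = (1, 2, 3).$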

It should be remarked that the condition $\delta I = 0$ is not modified when
an exact differential is added under the integral sign. Hence one may always
suppose $A_4 = 0$ in equations (4), or $C_4 = 0$ in equations (6), without
essential specialization. In this way it appears conversely that to every
system (7) of the third order containing an arbitrary constant $h$ there
corresponds the Pfaffian system (6) in $y_1, y_2, y_3, y_4$ $(y_4 = h)$ with $C_4 = 0$.

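But if the independent variable $t$ is modified by writing

(8)    $dt = \sqrt{\Delta}\, d\tau,$

the system (7) takes the still simpler form:

$\dfrac{dy_i}{d\tau} = \dfrac{\partial C_k}{\partial y_j} - \dfrac{\partial C_j}{\partial y_k}, \qquad (i, j, k) = (1, 2, 3).$

The three functions on the right here are functions $Y_i(y_1, y_2, y_3)$ such that

(9)    $\dfrac{\partial Y_1}{\partial y_1} + \dfrac{\partial Y_2}{\partial y_2} + \dfrac{\partial Y_3}{\partial y_3} = 0.$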

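(1) The skew-symmetric determinant is the square of

$\sum \left( \dfrac{\partial C_j}{\partial y_4} - \dfrac{\partial C_4}{\partial y_j} \right) \left( \dfrac{\partial C_k}{\partial y_l} - \dfrac{\partial C_l}{\partial y_k} \right), \qquad (j, k, l) = (1, 2, 3).$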




Conversely, given three functions $Y_i$ that satisfy condition (9),
one can always find three functions $C_1, C_2, C_3$ such that these functions
correspond to the given $Y_i$.
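
Condition (9) can be checked numerically for right-hand sides built from coefficients $C_1, C_2, C_3$. The following is a minimal sketch, assuming the convention $Y_i = \partial C_k/\partial y_j - \partial C_j/\partial y_k$ with $(i, j, k)$ in cyclic order; the particular functions in `C`, the step `H` and the helper names are hypothetical choices made only for illustration. Derivatives are approximated by central differences.

```python
import math

H = 1e-3  # finite-difference step (assumption, chosen for illustration)


def C(y1, y2, y3):
    """Hypothetical analytic coefficients C_1, C_2, C_3."""
    return (math.sin(y2) * y3, y1 * y1 * math.cos(y3), math.exp(0.3 * y1) * y2)


def partial(func, comp, var, y):
    """Central difference of component `comp` of `func` with respect to y[var]."""
    up, dn = list(y), list(y)
    up[var] += H
    dn[var] -= H
    return (func(*up)[comp] - func(*dn)[comp]) / (2.0 * H)


def Y(y1, y2, y3):
    """Right-hand sides Y_i = dC_k/dy_j - dC_j/dy_k, (i, j, k) cyclic."""
    y = (y1, y2, y3)
    out = []
    for i in range(3):
        j, k = (i + 1) % 3, (i + 2) % 3
        out.append(partial(C, k, j, y) - partial(C, j, k, y))
    return tuple(out)


def divergence(y):
    """Left-hand side of condition (9), evaluated at the point y."""
    return sum(partial(Y, i, i, y) for i in range(3))


point = (0.4, -0.7, 1.1)
print("dY1/dy1 at the point:", partial(Y, 0, 0, point))
print("divergence of Y     :", divergence(point))
```

The individual terms of the sum are not small, but they cancel: the printed divergence is zero up to rounding error, which is the content of (9) for fields of this form.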

We thus see that the general Pfaffian system reduces effectively to
a system $n = 3$ of the form

(10)    $\dfrac{dy_i}{d\tau} = Y_i(y_1, y_2, y_3) \qquad (i = 1, 2, 3),$

where the functions $Y_i$ are subject to the sole condition (9) in $S_3$. We do
not suppose that the same variables $y_1, y_2, y_3$ can be employed everywhere.
Hence we conclude:

Except in the neighborhood of an equilibrium point, every Pfaffian system (4) of
the fourth order reduces, for a given value $b$ of the integral $B = b$ (see (4)),
to a system (10) of the third order for which volume is an invariant.

Conversely, every system (10) of the third order in which the functions $Y_i$ are
subject to condition (9) corresponds to a reduced Pfaffian system

$\delta \int ( C_1\, dy_1 + C_2\, dy_2 + C_3\, dy_3 + y_4\, dt ) = 0,$

where $t$ is defined by (8).

It is convenient to make the following remark here:

Equations (7) show that, with the independent variable $t$ in place
of $\tau$, the three-dimensional volume integral

$\iiint \sqrt{\Delta}\; dy_1\, dy_2\, dy_3$

is an invariant, and hence also the following four-dimensional integral:

$\iiiint \sqrt{\Delta}\; dy_1\, dy_2\, dy_3\, dy_4 .$

Using the identity (5), one concludes that for any Pfaffian system (4)
of the fourth order the integral

$\iiiint \sqrt{\Delta}\; dx_1\, dx_2\, dx_3\, dx_4$

is an invariant, independently of the particular dependent variables
chosen. The existence of this integral invariant, as well as other
important properties of Pfaffian systems, has been established by Féraud (1).

We are thus led to consider in the first place such reduced systems (10),
without equilibrium points. Moreover we shall suppose that the variables
$(y_1, y_2, y_3)$ correspond to the points $P$ of a three-dimensional space $S_3$
which is closed and analytic throughout. The restricted problem of three
bodies (2) and many other important problems satisfy these conditions.


2. Reduction to the canonical form of a single degree of freedom.


By means of the general group of point transformations one can
reduce every differential system (1) of order $n$ to a particular system,
for example to

$\dfrac{dy_1}{dt} = 1, \qquad \dfrac{dy_2}{dt} = \cdots = \dfrac{dy_n}{dt} = 0,$

provided only that the region considered is chosen sufficiently small about
the given point $(y_1, \ldots, y_n)$ and that this point is not an equilibrium
point (3). Hence to say that a system of order $n$ $(= 2m)$ is of Pfaffian type
has no local significance.

Suppose now that the Pfaffian system (4) admits in $S_4$ a periodic
solution for a given value of $B = b$. To this there will correspond a
closed trajectory in $S_3$, whose complete neighborhood has the topological
character of a torus. By introducing suitable variables $y_1, y_2, y_3$, of
which $y_3$ is an angular variable of period $2\pi$, we can make this
trajectory correspond to the $y_3$-axis, so that the closed trajectory is
given by the equations $y_1 = y_2 = 0$, $y_3 = t$, and the volume integral
$\iiint dy_1\, dy_2\, dy_3$ still remains an invariant (4). But on employing $t$
as independent variable, where $dt = Y_3\, d\tau$, we obtain the modified system

(11)    $\dfrac{dy_1}{dt} = F(y_1, y_2, t), \qquad \dfrac{dy_2}{dt} = G(y_1, y_2, t),$

because we then have essentially $y_3 = t$; we have written here
$F = Y_1/Y_3$, $G = Y_2/Y_3$. The identity (9) now takes the form

(12)    $\dfrac{\partial (MF)}{\partial y_1} + \dfrac{\partial (MG)}{\partial y_2} + \dfrac{\partial M}{\partial t} = 0,$

where $M = Y_3$. This choice of the independent variable is possible in the
neighborhood of the $y_3$-axis because $M$ is equal to one along this axis
and therefore does not reduce to zero in the neighborhood. The three
functions $F$, $G$ and $M$ are analytic and periodic of period $2\pi$ in $t$.

(1) On Birkhoff's Pfaffian Systems, Transactions of the American Mathematical Society,
vol. 32 (1930).

(2) In the restricted problem it is necessary to "regularize" the variables. See D. S., Chap. 2.

(3) See D. S., p. 56.

(4) From the geometric point of view this amounts only to saying that one can deform the
closed trajectory analytically into a circle without modifying the volume.

With the variables $y_1$, $y_2$ and $t$, the volume integral invariant is
$\iiint M(y_1, y_2, t)\, dy_1\, dy_2\, dt$. This fact shows that the double integral
$\iint M(y_1, y_2, t)\, dy_1\, dy_2$ is also an invariant.

Let us now introduce new variables $p$, $q$ for each $t$, such that


- 1 *“" 


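With this particular choice we shall have

$\dfrac{\partial (p, q)}{\partial (y_1, y_2)} = M(y_1, y_2, t),$

and the integral invariant takes the form $\iint dp\, dq$. In these variables
the differential equations are the following two,

$\dfrac{dp}{dt} = F^*(p, q, t), \qquad \dfrac{dq}{dt} = G^*(p, q, t),$

where we have

$\dfrac{\partial F^*}{\partial p} + \dfrac{\partial G^*}{\partial q} = 0,$

since $\iint dp\, dq$ is an integral invariant. There exists therefore a
function $H(p, q, t)$, analytic and of period $2\pi$ in $t$, such that

$F^* = \dfrac{\partial H}{\partial q}, \qquad G^* = -\dfrac{\partial H}{\partial p}.$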

This function $H$ is explicitly defined by the following equation:

$H = \int_{(0,0)}^{(p,q)} ( -G^*\, dp + F^*\, dq ).$


In these final variables the equations take the following canonical form:

(13)    $\dfrac{dp}{dt} = \dfrac{\partial H}{\partial q}(p, q, t), \qquad \dfrac{dq}{dt} = -\dfrac{\partial H}{\partial p}(p, q, t),$

whence the following result:

In the neighborhood of a closed trajectory of $S_3$, the Pfaffian differential
equations (4) can be reduced to the canonical form (13) of a single degree of
freedom, where $H$ is analytic in $p, q, t$ and periodic of period $2\pi$ in $t$,
while the closed trajectory itself corresponds to $p = q = 0$, $0 \le t \le 2\pi$.

The same canonical equations are obtained by starting from the
classical variational equation,

$\delta \int ( p\, dq + H(p, q, t)\, dt ) = 0.$

Nevertheless these equations do not correspond precisely to a Pfaffian
system of the second order analogous to (4), since the function $H$ contains
the independent variable $t$.


3. Reduction to a point transformation T.

There is another reduction of our Pfaffian system, which replaces the
reduced systems (10) or (13) by a two-dimensional point transformation
possessing a two-dimensional integral invariant, that is, by a
"conservative" transformation; such a reduction was first employed by
Poincaré (loc. cit.). We wish to indicate the well-known basis of this
reduction.



To begin, we consider the immediate neighborhood of a closed trajectory,
where the equations can be employed in the canonical form (13).
Denote by $p_0, q_0$ the coordinates $p, q$ of a point $P_0$ in the plane $t = 0$
of the space $(p, q, t)$. Following a point $P$ along the unique trajectory
that contains $P_0$, from $t = 0$ to $t = 2\pi$, one obtains a unique
corresponding point $P_1 = (p_1, q_1, 2\pi)$, such that $p_1, q_1$ are deduced from
$p_0, q_0$ by an analytic point transformation,

(14)    $p_1 = \varphi(p_0, q_0), \qquad q_1 = \psi(p_0, q_0).$

This transformation $T$ of the plane of the variables $p, q$ does not change
areas, since $\iint dp\, dq$ is an integral invariant. Moreover the point $(0, 0)$
is a fixed point of $T$ in this plane, because the $t$-axis corresponds to the
closed trajectory. When $t$ increases from $2\pi$ to $4\pi$ the point $P_1 = T(P_0)$ is
transformed into $P_2 = T(P_1) = T^2(P_0)$. More generally we obtain the sequence of
points $\ldots, P_{-1}, P_0, P_1, P_2, \ldots$ as $t$ increases or decreases indefinitely.
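
The two properties of $T$ stated above, conservation of area and the fixed point at $(0, 0)$, can be illustrated numerically. The following is only a sketch under stated assumptions: the Hamiltonian $H = \tfrac12 (p^2 + q^2) + \varepsilon \cos t \; p^2 q$ is a hypothetical example of period $2\pi$ in $t$ that has $p = q = 0$ as a solution, the equations are taken in the form $dp/dt = \partial H/\partial q$, $dq/dt = -\partial H/\partial p$, and the trajectory is followed by a classical Runge-Kutta scheme rather than by any method of the text. The names `field`, `period_map` and `jacobian_det` are introduced here.

```python
import math

EPS = 0.1  # hypothetical perturbation strength


def field(p, q, t):
    """Canonical equations for H = (p^2 + q^2)/2 + EPS*cos(t)*p^2*q (assumed example)."""
    dp = q + EPS * math.cos(t) * p * p            # dp/dt =  dH/dq
    dq = -p - 2.0 * EPS * math.cos(t) * p * q     # dq/dt = -dH/dp
    return dp, dq


def period_map(p, q, steps=2000):
    """Follow the trajectory from t = 0 to t = 2*pi with classical RK4: (p0, q0) -> (p1, q1)."""
    h = 2.0 * math.pi / steps
    t = 0.0
    for _ in range(steps):
        k1 = field(p, q, t)
        k2 = field(p + 0.5 * h * k1[0], q + 0.5 * h * k1[1], t + 0.5 * h)
        k3 = field(p + 0.5 * h * k2[0], q + 0.5 * h * k2[1], t + 0.5 * h)
        k4 = field(p + h * k3[0], q + h * k3[1], t + h)
        p += h * (k1[0] + 2.0 * k2[0] + 2.0 * k3[0] + k4[0]) / 6.0
        q += h * (k1[1] + 2.0 * k2[1] + 2.0 * k3[1] + k4[1]) / 6.0
        t += h
    return p, q


def jacobian_det(p, q, d=1e-5):
    """Determinant of the Jacobian of the period map, by central differences."""
    pa, qa = period_map(p + d, q)
    pb, qb = period_map(p - d, q)
    pc, qc = period_map(p, q + d)
    pd, qd = period_map(p, q - d)
    dp_dp, dq_dp = (pa - pb) / (2 * d), (qa - qb) / (2 * d)
    dp_dq, dq_dq = (pc - pd) / (2 * d), (qc - qd) / (2 * d)
    return dp_dp * dq_dq - dp_dq * dq_dp


print("T(0, 0)        =", period_map(0.0, 0.0))
print("det dT at (.3,.2) =", jacobian_det(0.3, 0.2))
```

A Jacobian determinant equal to one at every point is the local form of the statement that $T$ does not change areas; the computed value agrees with one up to the discretization error of the scheme.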

This conservative transformation $T$ characterizes completely the
neighborhood of the trajectory from the point of view of the fundamental
group of transformations. Indeed, given two Pfaffian systems that lead to
the same transformation $T$ (1), we can find a point transformation that
will transform one of these systems into the other. It is only necessary
to make the two trajectories that pass through the same point
$P_0 = (p_0, q_0, 0)$ correspond, in such a way that points having the same
value of $t$ correspond to each other.

(1) Apart, naturally, from a change of variables.

In order to extend the scope of such a reduction, we shall first
define the "regular surfaces of section" in $S_3$ in the following manner:
such an analytic surface, $S_2$, is traversed everywhere in the same sense by
the trajectories, which are non-tangent except along the edges (finite in
number), these edges corresponding to closed trajectories; moreover, every
trajectory, except these closed trajectories, must cut this surface at
least once in every interval of time $L$ ($L$ sufficiently large, but fixed).

Now let $P$ be any interior point of such an $S_2$, and let $Q = T(P)$ be the
point of $S_2$ that follows $P$ on the same trajectory. There is thus defined a
one-to-one transformation $T$ of $S_2$ into itself, which is analytic in the
interior of $S_2$ and continuous on its edges. Consider a small region
$d\sigma$ about the point $P$ of $S_2$, with corresponding coordinates $y_1, y_2$,
and the tubular region formed by the trajectories issuing from $d\sigma$ and
prolonged to the first point $Q$ of meeting with $S_2$. Denote the coordinates
of this point $Q$ of meeting by $\bar y_1, \bar y_2$, and by $d\bar\sigma$ the area that
corresponds to $d\sigma$. Suppose that the time increases by $dt$. The generalized
volume that expresses the integral invariant will not change. Hence the
generalized volume of the small cylinder swept out by $d\sigma$ must be equal
to that of the small cylinder swept out by $d\bar\sigma$, that is

$M(P)\, d\sigma\, dn = M(T(P))\, d\bar\sigma\, d\bar n,$

where $d\sigma = \iint dy_1\, dy_2$, $d\bar\sigma = \iint d\bar y_1\, d\bar y_2$, where $dn$ and $d\bar n$ denote
the heights of the two small cylinders, and where $\iiint M(P)\, d\sigma\, dn$ is the
integral invariant expressed in the coordinates $y_1, y_2, n$. It follows that
the transformation $T$ leaves invariant the integral of areas,
$\iint N\, dy_1\, dy_2$, where $N = M\, dn/dt$ is an analytic function, positive in the
interior of $S_2$ but zero along the edges.

Consequently $T$ is a conservative transformation of $S_2$ into itself.
It is easy to introduce variables $z_1, z_2$ such that the generalized area
becomes the ordinary area. For this it suffices to write (for example)



Our result may be stated as follows:

Let $S_2$ be any regular surface of section in $S_3$, cut by a trajectory at the
point $P$. The analytic transformation $Q = T(P)$, obtained by following the
trajectory issuing from the point $P$ to the first point $Q$ of meeting with $S_2$,
is necessarily conservative.

The transformation $T$ so defined characterizes the Pfaffian system from
the theoretical point of view.


4. Existence of open surfaces of section.

There is another particular kind of surface of section (1) which
possesses considerable interest, namely an "open surface of section" $S_2$,
which will be defined in the following manner: it is open and simply
connected, and every trajectory cuts this surface at least once in every
interval of time $L$ ($L$ sufficiently large but fixed).

(1) We regard as a "surface of section" any analytic surface (closed or not)
which is traversed everywhere in the same sense by non-tangent trajectories.

The existence of such open surfaces of section is not difficult to
prove. Indeed, choose points $P_1, \ldots, P_\lambda$ distributed in $S_3$ such that
(1) every point of $S_3$ lies at a distance less than $\varepsilon > 0$ ($\varepsilon$
sufficiently small) from some point $P_i$, (2) within the spheres of radius
$2\varepsilon$ having $P_1, \ldots, P_\lambda$ as centers, the direction of the trajectories
does not change much (1). Now construct the planes perpendicular to the
corresponding trajectories at these points $P_1, \ldots, P_\lambda$. Let
$\sigma_1, \ldots, \sigma_\lambda$ be the circular regions of these planes interior to the
corresponding spheres. Evidently every trajectory will traverse at least
one of these circles in every interval of time $T$, where $T$ is a fixed
constant independent of the trajectory chosen.



(1) We suppose as always that the differential system admits no equilibrium
points in $S_3$.

We can now replace these plane elements of surface by others enjoying
analogous properties and, moreover, having no points in common. For this,
observe first that the $\sigma_i$ are divided by analytic lines and points of
intersection into parts which can be rendered simply connected by cuts.
By drawing all these parts back a little and adding small ribbons or
elements of surface in the neighborhood of each line or point of
intersection, one obtains without difficulty a finite number of simply
connected elements without any point in common, which are not tangent to
any trajectory and which are cut by every trajectory within a sufficiently
long interval of time. Let $\lambda^* \ge \lambda$ be the number of elements of surface
$\sigma_i^*$ so obtained.

We can now replace these $\lambda^*$ elements by $\lambda^* - 1$ elements of surface
enjoying the same properties. Suppose first that a trajectory cuts two
elements $\sigma_i^*$ and $\sigma_j^*$ at the successive points $P_i$ and $P_j$. One can
then extend a ribbon of surface, $b$, from the edge of $\sigma_i^*$ to the edge of
$\sigma_j^*$, so that $b$ is everywhere near either $\sigma_i^*$, or the trajectory
$P_i P_j$, or $\sigma_j^*$ (see the accompanying figure), without this ribbon
becoming tangent to any trajectory. The new element $\sigma_i^* + b + \sigma_j^*$,
which is simply connected, can replace the two elements $\sigma_i^*$ and
$\sigma_j^*$, provided it is modified a little so as to render it analytic.

Continuing in this way one will finally reduce the number of elements
of surface to a single element. Otherwise one would obtain $\lambda^{**} > 1$
elements of surface $\sigma_i^{**}$, such that no trajectory passing through a
point $P_i$ of such a surface $\sigma_i^{**}$ contains any point $P_j$ of another
surface $\sigma_j^{**}$. But this is impossible; indeed, in the contrary case the
tubular regions formed by the arcs of trajectories with their ends in the
elements of surface would divide $S_3$ into parts without any point in common.

In this manner one can always construct an open surface of section $S_2$,
analytic, everywhere simply connected, and nowhere tangent to a trajectory,
such that every trajectory cuts $S_2$ at least once in every sufficiently long
interval of time $L$.

By employing such a surface $S_2$, one can define a transformation
$Q = T(P)$ as before. This transformation will be one-to-one and analytic
for every point $P$ in the interior of $S_2$, except in the immediate
neighborhood of the analytic curves along which the first image $Q$ of $P$
lies on the edge of $S_2$. On these exceptional analytic curves $T$ will in
general be discontinuous.

One can also construct such an open surface of section by means of
a regular surface of section. A simple example will suffice to explain
the method of construction. Take the case where the regular surface of
section $S_2$ is a torus (from the topological point of view) with two edges
$A$, $B$ corresponding to two closed trajectories in $S_3$ which are described
in the same sense.

To obtain from it an open surface of section, we begin by making two
cuts in $S_2$ in order to render the torus simply connected. We take the
first cut as a closed cut and then modify $S_2$ a little along one of its
edges, so that these edges are entirely distinct and every trajectory in
the neighborhood cuts $S_2$ (see the accompanying figure 2a). The second cut
will be open, and we can modify $S_2$ once more in an analogous manner (see
figure 2b).



The surface so modified, $S_2^*$, will be plane from the topological point
of view. We shall suppose that the trajectories penetrate $S_2^*$ from below
in the corresponding figure 3. Now we remove two small ribbons of surface
along $A$ and $B$ (see the figure below), and then we make two open cuts
which extend respectively from the modified edges $A^*$ and $B^*$ to the outer
edge $C$, modifying $S_2^*$ in an analogous manner along these cuts (see the
same figure). Evidently the surface $S_2^{**}$ so obtained is simply connected
and is traversed from below by all the trajectories within a sufficiently
long interval of time, with the sole exception of the trajectories near $A$
and $B$. By adding to $S_2^{**}$ two small simply connected pieces of surface
which cut $A$ and $B$ respectively in the manner indicated in the figure, we
obtain an open surface of section.




5. A criterion for the existence of regular surfaces of section.

A necessary and sufficient condition for the existence of a regular
surface of section is the following: consider all the topological types of
analytic surfaces $S_2$ in $S_3$ which have only closed trajectories at their
edges. If one of these surfaces is cut at least once more in the positive
sense than in the negative sense by every trajectory (1) in every
sufficiently long interval of time $L$, a closed surface of section will
exist. Unfortunately this condition is completely impracticable.

A very important new criterion is the following:

For the existence of a regular surface of section it is necessary (if not
sufficient) that, for every pair $S_2'$ and $S_2''$ of simply connected surfaces of
section (2), the arc $Q'Q''$ of any trajectory which joins $Q'$ of $S_2'$ to $Q''$ of
$S_2''$ remain of bounded length when $Q'$ and $Q''$ vary (continuously) in $S_2'$ and
$S_2''$ respectively without passing beyond their boundaries.

(1) Except the closed trajectories which correspond to the edges.
(2) The surfaces $S_2'$ and $S_2''$ may even be identical.

To prove this, suppose that there exists a regular surface of section
$S_2$, with edges which correspond to the closed trajectories $A, B, \ldots, K$.
Some of these trajectories may cut $S_2'$ in at least one point; let us say
that $A, B, \ldots$ cut it in $A_1, A_2, \ldots$, $B_1, B_2, \ldots$ respectively; and let
us mark these points on $S_2'$. Imagine that all these points are surrounded
by small geodesic circles of $S_2'$ (see figure 4). The region of $S_2'$
outside these circles, whose boundary is composed of these circles and of
the boundary of $S_2'$, contains none of the points $A_i, B_j, \ldots, K_r$ of the
edges of $S_2$.

Consider now any point $Q'$ of this region of $S_2'$. There will
certainly exist a first point $P$ of $S_2$, prior to $Q'$ and on the same
trajectory, and a first subsequent point $P_1 = T(P)$, according to the
fundamental property of a regular surface of section. When $P$ and $Q'$ vary
in a continuous but arbitrary manner in $S_2$ and $S_2'$ respectively, the
point $Q'$ cannot leave the arc $PP_1$ of the trajectory without $Q'$ coinciding
with $P$ or $P_1$. This happens only along the lines of intersection of $S_2$
and $S_2'$, finite in number (see the figure above). The length of $PQ'$ will
always remain bounded during this variation, even if $Q'$ passes in this
manner from the interior of $PP_1$ into that of $P_1P_2$, or of $P_{-1}P$, $\ldots$, by
crossing the lines of intersection. Moreover the relation between $P$ of
$S_2$ and $Q'$ of $S_2'$ will be analytic. However it may well happen that to a
single point $P$ of $S_2$ there corresponds more than one point $Q'$ of $S_2'$.
When $P$ lies in the neighborhood of one of the edges of $S_2$, the point $Q'$
lies in the interior of one of the small circles about $A_i, B_j, \ldots$, and
conversely.

Consequently we can represent $S_2'$ on $S_2$ by employing sheets analogous
to those of a Riemann surface. The length of $PQ'$ will remain bounded if
$Q'$ crosses the lines of intersection a finite number of times.

Instead of supposing $P$ to be the first point prior to $Q'$, we could
equally well suppose $P$ to be the second prior point, or another point of
$S_2$ situated on the same trajectory as $Q'$, and the same conclusion will
still hold.

This reasoning shows likewise that the length of the arc $PQ''$ of a
trajectory remains bounded when $Q''$ varies in any other simply connected
surface of section $S_2''$ in an analogous manner.

It follows therefore that the length of the arc $Q'Q''$ remains bounded
also, provided that the lines of intersection of $S_2'$ and $S_2''$ with $S_2$ are
crossed only a finite number of times. But when $Q'$ describes a small closed
curve, it is evident that the arc $Q'Q''$ does not change in length. Hence
the arc will remain bounded when $Q'$ and $Q''$ vary freely in $S_2'$ and $S_2''$
respectively without passing beyond the boundaries of these regions, as we
wished to prove.

When this necessary criterion is satisfied, it is possible, at least
in many cases, to extend the open surfaces of section by uniting two of
these surfaces into a single one. Indeed, let $S_2'$ and $S_2''$ be two surfaces
of this kind, and $Q'$ and $Q''$ two points of $S_2'$ and $S_2''$ respectively
which are situated on the same trajectory. Let $Q'$ and $Q''$ vary freely
without crossing the boundaries of $S_2'$ and $S_2''$ respectively. It may
happen that $Q'$ covers a simply connected region of $S_2'$ whose boundary
consists of parts of the boundary of $S_2'$ and of interior arcs which
correspond to parts of the boundary of $S_2''$, and also that the region of
$S_2''$ covered by $Q''$ has analogous properties and corresponds in a
one-to-one and analytic manner to the region covered by $Q'$.

By letting every point of $S_2'$ move along its trajectory one can then
advance the points in this simply connected part of $S_2'$ to the
corresponding points of $S_2''$, while suitably advancing the other points of
$S_2'$ as well. By this method $S_2'$ and $S_2''$ are united in a single open
surface of section.

When the above criterion is not satisfied (that is, when the length of
an arc $Q'Q''$ of a trajectory can become infinite as two corresponding
points $Q'$ and $Q''$ move in two surfaces of section $S_2'$ and $S_2''$
respectively), I believe that the corresponding differential system is in
general of a much more complicated structure than in the contrary case.


6. An example.

As a particular illustration, consider the system of the third
order in $x, y, z$,

(15)    $\dfrac{dx}{dt} = -f(r)\, y = X(x, y, z), \qquad \dfrac{dy}{dt} = f(r)\, x = Y(x, y, z), \qquad \dfrac{dz}{dt} = g(r) = Z(x, y, z),$

where $r^2 = x^2 + y^2$ and where all the points $(x, y, z + 2k\pi)$ $(k = 0, \pm 1, \pm 2, \ldots)$
are regarded as identical. Evidently the system will be regular for $r \ge 0$
if $f(r)$ and $g(r)$ are analytic in $r^2$ for $r \ge 0$.

On introducing the polar coordinates $r, \theta$ corresponding to the
rectangular coordinates $x, y$, one finds

(15')    $\dfrac{dr}{dt} = 0, \qquad \dfrac{d\theta}{dt} = f(r), \qquad \dfrac{dz}{dt} = g(r),$

whence one obtains the general solution,

$r = c, \qquad \theta = f(c)\, t + d, \qquad z = g(c)\, t + e.$



To see what happens at infinity, let us make the change of variables
$w' = 1/w$ in the plane $w = x + iy$, so that $r' = 1/r$, $\theta' = -\theta$, and let us
write $z' = z$. The equations still remain of the same form as before:

$\dfrac{dr'}{dt} = 0, \qquad \dfrac{d\theta'}{dt} = -f(1/r'), \qquad \dfrac{dz'}{dt} = g(1/r'),$

which shows that equations (15) will be regular everywhere and without
equilibrium point if $f(r)$ and $g(r)$ are analytic in $r^2$ at infinity and do not
vanish simultaneously.

It follows therefore that the points of the corresponding space $S_3$ are
in one-to-one and analytic correspondence with the pairs $(P, Q)$, where $P$
is any point of the plane of the complex variable and $Q$ is any point of a
circle. Hence $S_3$ can be represented by the region between two concentric
spheres, provided that each two points of the two spheres on the same
radius are regarded as identical.


Moreover, the integral $\iiint M(r)\, dx\, dy\, dz$ is invariant, since we have

$\dfrac{\partial (MX)}{\partial x} + \dfrac{\partial (MY)}{\partial y} + \dfrac{\partial (MZ)}{\partial z} = 0;$

in particular the following integral is invariant:

$\iiint \dfrac{dx\, dy\, dz}{1 + r^4} = \iiint \dfrac{r\, dr\, d\theta\, dz}{1 + r^4}.$

But on introducing the variables $r', \theta', z'$ into this integral it remains
of precisely the same form. One concludes therefore that this volume
integral is invariant everywhere in $S_3$. Consequently we see that
equations (15) are essentially of the form (10), characteristic of a
reduced Pfaffian system.
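
The two facts used in this paragraph can be verified in a few lines. The following is a sketch only, assuming the system has right-hand sides $X = -f(r)\,y$, $Y = f(r)\,x$, $Z = g(r)$ and that the substitution at infinity is $r = 1/r'$, $\theta = -\theta'$, $z = z'$; the abbreviation $\mu(r) = M(r) f(r)$ is introduced here.

```latex
% Invariance of the integral with any density M(r), writing \mu(r) = M(r) f(r):
\begin{align*}
\frac{\partial (MX)}{\partial x} + \frac{\partial (MY)}{\partial y} + \frac{\partial (MZ)}{\partial z}
  &= -\mu'(r)\,\frac{x}{r}\,y + \mu'(r)\,\frac{y}{r}\,x + 0 = 0 .
\end{align*}

% Form of the particular integral at infinity:
\begin{align*}
\frac{r\,dr\,d\theta\,dz}{1 + r^4}
  &= \frac{1}{r'}\Big(-\frac{dr'}{r'^2}\Big)\big(-d\theta'\big)\,dz'\;\frac{r'^4}{1 + r'^4}
   = \frac{r'\,dr'\,d\theta'\,dz'}{1 + r'^4} .
\end{align*}
```

The two changes of sign, from $dr = -dr'/r'^2$ and from $d\theta = -d\theta'$, cancel, and the factor $r'^4$ coming from $1 + r^4 = (1 + r'^4)/r'^4$ restores the original form of the integrand.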

Cherchons maintenant un systeme pfaffien correspondant. Pour l’obtenir 
nous determinons en premier lieu trois fonctions A, B, C de x, y , x (*) tel- 
les que 


->B 


-5C 


*=f(r)y , 


^x 




= -f{r)x 




l>k 

>3f 



(•) Ce sont les fonctions analogues A C t . C,. C, plus haul tandis que z, y, x remplaceot 
y 4 , y t . y, respectivement. 




Si nous substituons dans ces Equations, 


A = -£- £ r rj(.)rfr , B = -^- J rj(r)rfr , C = - £ r/ (,) dr , 

elles se trouvent satisfaites. La solution la plus generate s'obtient en ajou- 
tant k ces valeurs de A, U, C des termes >p/*r, **>/** ■*>/** respectivement, 
ou <p est une fonction arbitraire de x, y, x. Mais nous pouvons negliger ces 
termes puisqu’ils ne changent rien dans le systeme pfaffien correspondant 
qui se deduit de 1 '^quation de variation, 


6 1 (A dz + Brfy + Crf:+ D dw + wdl) = 0 


La fonction D ici est k choisir. Mais un calcul eiementaire montre 


que nous avons 




et aussi que les Equations pfaffiennes se reduisent k la forme 

1 —£-»« <—““>■ 

qui correspondent k ( 7 ). L'invariant integral aura la forme ddsirde si 


En choisissant 


|/A — (1 + f 4 )-*. 




«i + V>0+' 4 > ’ 


cette relation se trouve satisfaite. Par consequent les Equations ( 15 ) se de- 
duisent de liquation de variation, 

6 f IttC 9 (0 <*' (yd* — *dy) — ( f rf{r) dr dz + 

( 16 ) 

— Arz -4- /d dw , . ~1 . 

*j(r) + //M l+r« + wd ‘J = 0 ’ 


-i ~wdt \ = 




qui sera valable partout si kg(r) -f //(r) 0 pour des valeurs particulieres 

de k et /. 

Let us pass now to the interesting question of the existence of
regular surfaces of section. If the functions $f$ and $g$ vanish
simultaneously, which gives rise to corresponding equilibrium points, such
surfaces cannot exist, according to their definition. Indeed such a point
can be situated neither in the interior of a hypothetical $S_2$, nor on one
of its edges, nor outside $S_2$: in the first case the corresponding
trajectory (which is only a point) would not traverse $S_2$; in the second
case the corresponding edge would not consist of a closed trajectory; in
the third case the equilibrium point would correspond to a trajectory that
never traverses $S_2$.

There are two cases in which the existence of such surfaces is
immediate, namely the case $f \neq 0$ and the case $g \neq 0$. Indeed if $f \neq 0$,
the angle $\theta$ increases or decreases constantly, and $\theta = 0$ (that is,
$y = 0$, $x \ge 0$, $0 \le z \le 2\pi$) represents such a surface of section, whose two
edges correspond to the two closed trajectories $r = 0$ and $r = \infty$. The
surface of section $S_2$ is in this case an annular region. One sees in an
analogous manner that for $g \neq 0$ a closed surface of section is the plane
$z = 0$, and that this surface of section is without edge and topologically
equivalent to a sphere.
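
For the case $g \neq 0$ the transformation $T$ of the surface $z = 0$ into itself can be written down explicitly and its conservative character checked. The following is a sketch under stated assumptions: the system is taken as $dr/dt = 0$, $d\theta/dt = f(r)$, $dz/dt = g(r)$ with $z$ counted modulo $2\pi$ and $g > 0$, so that a trajectory returns to $z = 0$ after the time $2\pi / g(r)$, having turned through the angle $2\pi f(r)/g(r)$; the functions `f` and `g` and the names `return_map` and `jacobian_det` are hypothetical and introduced here.

```python
import math


def f(r):
    """Hypothetical angular rate, analytic in r^2."""
    return 1.0 / (1.0 + r * r)


def g(r):
    """Hypothetical rate of z, positive everywhere."""
    return 1.0 + 0.5 / (1.0 + r * r)


def return_map(x, y):
    """First return Q = T(P) to the plane z = 0 (z counted modulo 2*pi)."""
    r = math.hypot(x, y)
    alpha = 2.0 * math.pi * f(r) / g(r)   # angle turned during the return time 2*pi/g(r)
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * x - s * y, s * x + c * y)


def jacobian_det(x, y, d=1e-6):
    """Determinant of the Jacobian of T, by central differences."""
    xa, ya = return_map(x + d, y)
    xb, yb = return_map(x - d, y)
    xc, yc = return_map(x, y + d)
    xd, yd = return_map(x, y - d)
    return ((xa - xb) * (yc - yd) - (xc - xd) * (ya - yb)) / (4.0 * d * d)


P = (0.7, 0.4)
Q = return_map(*P)
print("Q = T(P)        :", Q)
print("radius of P, Q  :", math.hypot(*P), math.hypot(*Q))
print("det dT at P     :", jacobian_det(*P))
```

Since $T$ leaves $r$ unchanged and has Jacobian determinant one, it preserves every integral $\iint N(r)\, dx\, dy$ with a density depending on $r$ alone; with the volume density $1/(1 + r^4)$ and $dn/dt = g(r)$ this would be $N = g(r)/(1 + r^4)$.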

More generally, if $kg + lf$ vanishes nowhere ($k$, $l$ being particular integers without common factor), we see that $kz - l\theta = 0$ is the equation of a regular surface of section $S_2$. In this case the surface $S_2$ will have the closed trajectories $r = 0$ and $r = \infty$ as its only boundaries, of multiplicity $l$. It seems entirely natural to admit such generalized regular surfaces of section, not excluding multiple boundaries.

As I have remarked previously (1), the existence of regular surfaces of section always implies the existence of increasing or decreasing angular variables, namely $kz - l\theta$ in this case where $kg + lf \neq 0$.

It is extremely curious that the variational equation (16) remains valid everywhere under precisely the same condition. This leads me to anticipate a very close relation between regular surfaces of section and the existence of a variational equation valid everywhere.

To treat the general case, let us try to study the general nature of such a surface $S_2$, if one exists. Each point $P$ in its interior must be a point at which $dr$ does not vanish in all directions; otherwise $S_2$ would be tangent to the corresponding trajectory at that point. It follows that the interior of $S_2$ is entirely filled either by closed analytic curves $r = \text{const.}$ without double points, or by analytic arcs without double points that extend to its boundaries. These boundaries themselves correspond to a finite number of exceptional values of $r$.

(1) See D. S., Chap.

It appears also that every value of $r$ is realized at least once on $S_2$, by the very definition of $S_2$. On the other hand, if such a value were realized on more than a single closed curve or connected set of such curves and arcs, there would certainly exist other points at which $dr = 0$, which is not possible.

One thus sees that such an $S_2$ must be of genus zero and will have a topological structure of the kind indicated in the accompanying figure.



More precisely, we may have exceptional values $r_1, r_2, \ldots, r_m$ of $r$, where

\[ 0 \leq r_1 < r_2 < \cdots < r_m \leq \infty , \]

for which there is at least one corresponding boundary. Since the surface $S_2$ is closed in $S_3$, the variables $\theta$ and $z$ must increase by $2k\pi$ and $2l\pi$ respectively ($k$, $l$ being integers) along every closed curve $r = \text{const.}$ that does not correspond to such an exceptional value. These values of $k$ and $l$ cannot change except when $r$ passes through one of its exceptional values, as elementary considerations of continuity show.

Moreover one must have $kg(r) + lf(r) \neq 0$ for $r$ not exceptional. Indeed for $r = r^*$, where $r^*$ is non-exceptional, one has $dz/d\theta = -g(r^*)/f(r^*)$ at every point of a corresponding trajectory, while $\theta$ and $z$ increase by $2k\pi$ and $2l\pi$ respectively when the point $P$ is made to describe the curve $r = r^*$ on $S_2$. If $kg(r^*) + lf(r^*) = 0$, the trajectories would have the direction $dz : d\theta = l : k$, which is the mean direction of that curve, so that $S_2$ would be tangent to the corresponding trajectory in at least one point, which would contradict the definition of $S_2$.

It also appears that, if $k'$ and $l'$ are the values of $k$ and $l$ for an adjacent exterior region, one must have $k'g(r) + l'f(r) \neq 0$ in that region. But if the point $P$ is made to describe the two limiting curves corresponding to the two regions, one sees that $\theta$ and $z$ increase by $2(k' - k)\pi$ and by $2(l' - l)\pi$ respectively when the boundary curves are described. Suppose that there are $a$ of these curves and that along each of them $\theta$ and $z$ increase by $2\kappa\pi$ and $2\lambda\pi$ respectively. One then obtains the relations $k' - k = a\kappa$, $l' - l = a\lambda$.

On the other hand the boundaries correspond to closed trajectories, so one must have $\kappa g(r_i) + \lambda f(r_i) = 0$ if $r_i$ denotes the exceptional value of $r$ on this boundary. Multiplying by $a$ gives $(k' - k) g(r_i) + (l' - l) f(r_i) = 0$. One thus sees that the quantity $kg(r) + lf(r)$ varies continuously and keeps a single sign everywhere, if $k$ and $l$ denote the corresponding integers, which in general change abruptly when $r$ passes through one of the exceptional values.

Hence, to construct a regular surface of section for given functions $f(r)$ and $g(r)$, one must choose a discontinuous function $\varphi(r) = k(r)/l(r)$ such that (1) the integers $k(r)$, $l(r)$ jump from $k$, $l$ to $k'$, $l'$ when $r$ passes certain exceptional values $r_1, \ldots, r_m$ of $r$ for which

\[ (k' - k)\, g(r_i) + (l' - l)\, f(r_i) = 0 , \]

and (2) the continuous function $k(r)g(r) + l(r)f(r)$ keeps a single sign. This is always possible.
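The two conditions just stated can be checked numerically for a candidate choice of the integers. The following Python sketch does this for one exceptional value; the sample functions `f` and `g`, the value `R1` and all helper names are assumptions introduced for illustration and are not taken from the memoir.

```python
# Hypothetical sample data, not from the memoir: f vanishes at the single
# exceptional value R1 = 1, which is what the jump condition will require.
def f(r):
    return 1.0 - r

def g(r):
    return 1.0

R1 = 1.0  # assumed exceptional value of r

def k_of(r):
    return 1  # k = 1 on both sides of R1

def l_of(r):
    return 1 if r < R1 else -1  # l jumps from 1 to -1 at R1

def combination(r):
    """The function k(r) g(r) + l(r) f(r), which must keep a single sign."""
    return k_of(r) * g(r) + l_of(r) * f(r)

def jump_condition(eps=1e-9):
    """(k' - k) g(R1) + (l' - l) f(R1), which must vanish at R1."""
    dk = k_of(R1 + eps) - k_of(R1 - eps)
    dl = l_of(R1 + eps) - l_of(R1 - eps)
    return dk * g(R1) + dl * f(R1)

samples = [i / 100.0 for i in range(0, 1001)]  # r from 0 to 10
values = [combination(r) for r in samples]
print("jump condition residual:", jump_condition())
print("smallest sampled value of k g + l f:", min(values))
```

With these sample functions the combination equals $2 - r$ for $r < 1$ and $r$ for $r \geq 1$, so it is continuous at $r = 1$ and stays positive, while the jump condition reduces to $-2 f(1) = 0$.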

There remains only the difficulty of detail of constructing a regular surface of section having boundaries corresponding to such a function. The examination of a particular case will suffice to indicate the method of construction of the corresponding surface. Take for example

[display giving $f(r)$ and $g(r)$ missing in the source]

Let us choose $r = r_1 = 1$ as the only exceptional value, with $k = 1$, $l = 1$ for $r < 1$, and with $k = 1$, $l = -1$ for $r > 1$. Our regular surface of section will be the surface $-kz + l\theta = 0$ except for a slight modification near $r = 1$. Thus the surface is given by $z - \theta = 0$ for $r < 1 - \varepsilon$ and by $z + \theta = 0$ for $r > 1 + \varepsilon$. Since $\theta$ varies monotonically with $t$ for $r < 1$ and for $r > 1$ respectively, it is evident that one can deform the part exterior to the cylinder $r = 1$ so as to change it into a ribbon composed of vertical and horizontal parts, as in the accompanying figure. The essential fact here is that this modification gives us a surface for $r \geq 1$ which is crossed everywhere in the same sense by the trajectories. In the same manner one can modify the surface for $r < 1$, as the same figure indicates. Uniting the two surfaces thus obtained into a single surface, one obtains very nearly a regular surface of section. Strictly speaking, this surface will have its boundaries along three closed trajectories, namely $r = 0$, $r = \infty$ and $r = 1$, $\theta = 0$ (or $2\pi$), and it will be crossed everywhere in the same sense by the trajectories, at least once in every sufficiently long interval of time. By a further modification one obtains an analytic surface with these same properties, which is therefore a regular surface of section.

The existence of regular surfaces of section for any choice whatever of $f(r)$ and $g(r)$ leads me to believe that such a surface exists in every integrable case without equilibrium points.


7. An application.

Although I have not so far succeeded in proving the conjecture just stated, I have established the following facts:

If a dynamical system containing an analytic parameter $\mu$ becomes integrable for $\mu = \mu_0$, and if for $\mu = \mu_0$ the dynamical system possesses no equilibrium points, it admits a regular surface of section, or at least it satisfies the necessary criterion given above. If such a surface exists for $\mu = \mu_0$, it will also exist for $\mu - \mu_0$ sufficiently small.

Let us begin with the integrable case $\mu = \mu_0$ without equilibrium points. I say that the criterion of section (5) must be satisfied. To see this, suppose that $S'_2$ and $S''_2$ are surfaces of section for which the criterion does not hold. The surfaces of trajectories corresponding to particular values of the constant in the known integral are closed and must cut $S'_2$ and $S''_2$ along certain analytic curves. On the other hand these surfaces of trajectories must be topologically equivalent either to tori or to closed lines, since there exist no equilibrium points for $\mu = \mu_0$; it is well known that every continuous distribution of directions on a closed surface other than the torus admits at least one point of indetermination.

In this way it appears that if $P'$ and $P''$ are any points of $S'_2$ and $S''_2$ respectively situated on the same trajectory, the surface of trajectories containing $P'$ and $P''$ must vary continuously with $P'$ or $P''$. But this does not suffice to show that the arc $P'P''$ remains finite. Nevertheless we can reason as follows. Let $\theta$, $\varphi$ and $\psi$ be three coordinates, of which $\theta$ and $\varphi$ are angular coordinates specifying the position on a surface of trajectories, while $\psi$ specifies the particular surface (1). Let $\iiint M(\theta, \varphi, \psi)\, d\theta\, d\varphi\, d\psi$ be the corresponding integral invariant; the function $M$ is positive, analytic, and periodic of period $2\pi$ in $\theta$ and in $\varphi$. Since the known integral is $\psi = \text{const.}$, it follows that $\iint M(\theta, \varphi, \psi)\, d\theta\, d\varphi$ is an integral invariant for each value of $\psi$.

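On the other hand the differential equations are of the form

\[ \frac{d\theta}{dt} = F(\theta, \varphi, \psi), \qquad \frac{d\varphi}{dt} = G(\theta, \varphi, \psi), \]

where the functions $F$ and $G$ are analytic and periodic of period $2\pi$ in $\theta$ and $\varphi$. Consequently the identity

\[ \frac{\partial}{\partial\theta}(MF) + \frac{\partial}{\partial\varphi}(MG) = 0 \]

must hold.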

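Hence, writing $dt = M\, d\tau$, the differential equations take the form

\[ \frac{d\theta}{d\tau} = F^*(\theta, \varphi, \psi), \qquad \frac{d\varphi}{d\tau} = G^*(\theta, \varphi, \psi), \]

where $F^* = MF$ and $G^* = MG$ are analytic and periodic of period $2\pi$ in $\theta$ and $\varphi$, and where one has

\[ \frac{\partial F^*}{\partial\theta} + \frac{\partial G^*}{\partial\varphi} = 0 . \]

(1) We suppose here that the surfaces considered do not reduce to a single trajectory, a case which requires only non-essential modifications.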




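But this last identity shows that there exists a function $H(\theta, \varphi, \psi)$, analytic in $\theta$ and $\varphi$ and periodic of period $2\pi$ in these variables, such that one may write

\[ F^* = -\frac{\partial H}{\partial\varphi} - \beta(\psi), \qquad G^* = \frac{\partial H}{\partial\theta} + \alpha(\psi), \]

where $\alpha$ and $\beta$ are analytic in $\psi$. Hence the equations written above admit the integral

\[ H(\theta, \varphi, \psi) + \alpha(\psi)\,\theta + \beta(\psi)\,\varphi = \text{const.} \]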
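As a numerical illustration of this integral, the sketch below integrates a divergence-free system on the torus and monitors the quantity $H + \alpha\theta + \beta\varphi$ along one trajectory. The sign convention $F^* = -\partial H/\partial\varphi - \beta$, $G^* = \partial H/\partial\theta + \alpha$, the sample function `H` and the constants `EPS`, `ALPHA`, `BETA` are assumptions chosen for the illustration, not data from the memoir.

```python
import math

# Hypothetical sample data (not from the memoir), for one fixed value of psi.
EPS, ALPHA, BETA = 0.3, 1.0, 0.7

def H(theta, phi):
    return EPS * math.sin(theta) * math.cos(phi)

def field(theta, phi):
    """Return (F*, G*) under the assumed sign convention."""
    f_star = EPS * math.sin(theta) * math.sin(phi) - BETA   # -dH/dphi - beta
    g_star = EPS * math.cos(theta) * math.cos(phi) + ALPHA  # dH/dtheta + alpha
    return f_star, g_star

def integral(theta, phi):
    """Candidate first integral H + alpha*theta + beta*phi (angles unwrapped)."""
    return H(theta, phi) + ALPHA * theta + BETA * phi

def rk4_step(theta, phi, h):
    """One classical Runge-Kutta step for d(theta)/d(tau) = F*, d(phi)/d(tau) = G*."""
    a1, b1 = field(theta, phi)
    a2, b2 = field(theta + 0.5 * h * a1, phi + 0.5 * h * b1)
    a3, b3 = field(theta + 0.5 * h * a2, phi + 0.5 * h * b2)
    a4, b4 = field(theta + h * a3, phi + h * b3)
    theta += h * (a1 + 2.0 * a2 + 2.0 * a3 + a4) / 6.0
    phi += h * (b1 + 2.0 * b2 + 2.0 * b3 + b4) / 6.0
    return theta, phi

def drift(theta0, phi0, h=0.005, steps=2000):
    """Absolute change of the candidate integral along a computed trajectory."""
    theta, phi = theta0, phi0
    start = integral(theta, phi)
    for _ in range(steps):
        theta, phi = rk4_step(theta, phi, h)
    return abs(integral(theta, phi) - start)

print("drift of the integral:", drift(0.4, 1.1))
```

The drift printed is only the discretization error of the integrator. With the sample constants $G^*$ never vanishes, so the system has no equilibrium points, in keeping with the hypothesis of the text.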

Since $F^*$ and $G^*$ do not vanish simultaneously, the two partial derivatives of the function on the left-hand side of this equation cannot vanish together. This shows also that $\alpha(\psi)$ and $\beta(\psi)$ cannot vanish together, because otherwise the two partial derivatives would have to vanish at the point $(\theta, \varphi)$ where $H(\theta, \varphi, \psi)$ attains its maximum.

It now follows from this last equation that not only does the surface of trajectories vary continuously with the point $P'$ of $S'_2$, but the constant on the right-hand side of this equation also varies in this way, since $\theta$ and $\varphi$ vary continuously with $P'$. Consequently the point $P''$ of $S''_2$ must vary continuously at the same time, and the length of the arc $P'P''$ will remain finite. Hence our criterion is satisfied for the integrable case $\mu = \mu_0$ even if there exists no regular surface of section.

The known methods of analytic continuation allow us to affirm further that if a regular surface of section exists for $\mu = \mu_0$ (with boundaries corresponding to simple periodic trajectories), such a surface will also exist for $\mu - \mu_0$ small.

It seems to me extremely probable that one can construct by analogous methods a regular surface of section in every integrable case by means of the open surfaces of section.

8. General remarks.

Let us indicate briefly the rather general scope of what precedes. Starting from a Pfaffian system of order $n = 2m$, and employing the known integral (see equation (2) for the case $n = 4$) as one of the dependent variables, for example $x_{2m}$, the variational equation takes the form

[display missing in the source]

with the integral $x_{2m} = \text{const.}$ Hence one can suppress this variable, which gives more simply

\[ \delta \int \sum_{i=1}^{2m-1} X_i\, dx_i = 0 . \]

But the expression under the integral sign is homogeneous of dimension 1 in $dx_1, \ldots, dx_{2m-1}$, so the value of the integral does not depend on the parameter chosen. Taking $x_{2m-1} = t$ as parameter, for example, we obtain the reduced variational equation

\[ \delta \int_{t_0}^{t_1} \Bigl( \sum_{i=1}^{2m-2} X_i\, dx_i + X_{2m-1}\, dt \Bigr) = 0 , \]

which gives us a Pfaffian system in an extended form in which the independent variable appears in the functions $X_i$.

In an analogous manner, if $k$ independent integrals are known, the Pfaffian system reduces to order $n - k - 1$ if $k$ is odd and to order $n - k$ if $k$ is even. One need only take the known integrals as dependent variables in the variational equation to effect this reduction.

As for a reduction to canonical form, this is always formally possible, but this reduction has not been effectively carried out except in the case $n = 2$. Despite the importance of the results obtained, especially by Poincaré and Levi-Civita, with the aid of canonical variables, the theoretical significance of such variables beyond that of any other Pfaffian system appears to be uncertain.

Almost all the results that we have obtained concerning the reduction of the Pfaffian system for $n = 4$ to a transformation $T$ extend immediately to the case $n = 2m > 4$. In the space $S_{2m-1}$ of the trajectories we start from an arbitrary surface $S_{2m-2}$ which is crossed everywhere in the same sense by these trajectories. Choosing the angular variable $\tau$ such that $\tau = 0, \pm 2\pi, \ldots$ correspond to $S_{2m-2}$, one deduces immediately from the fundamental variational equation that the transformation $T$ satisfies the following equation:

\[ \sum_{i=1}^{2m-2} \bigl[ X_i(x_1^{(1)}, \ldots, x_{2m-2}^{(1)}, 0)\, dx_i^{(1)} - X_i(x_1, \ldots, x_{2m-2}, 0)\, dx_i \bigr] = d\varphi , \]

where $x_1, \ldots, x_{2m-2}$ are the coordinates of an arbitrary point of $S_{2m-2}$, and where $x_1^{(1)}, \ldots, x_{2m-2}^{(1)}$ are the coordinates of the transformed point of $S_{2m-2}$, obtained by following the corresponding trajectory from $\tau = 0$ to $\tau = 2\pi$. The function $\varphi$ appearing on the right represents here the value of the integral $I$ along the arc of this trajectory. Hence the transformation $T$ is a contact transformation, enjoying all the well-known properties of such a transformation.

The same geometric method that is valid for $n = 2m = 4$ suffices here to construct an open surface of section $S_{2m-2}$ with the following two properties: (1) $S_{2m-2}$ is analytic and is never tangent to a trajectory; (2) every trajectory cuts $S_{2m-2}$ in the same sense, at least once in every sufficiently long interval of time.

In the case $n = 2m = 4$ the regular surfaces of section $S_2$ possess only boundaries formed by isolated periodic trajectories. For $n = 2m > 4$ such boundaries must be replaced by invariant analytic families of trajectories of $2m - 3$ dimensions. For example, in the case of the restricted problem of three bodies in which the infinitesimal body does not move in the plane of the two finite bodies, the invariant family of trajectories in which the infinitesimal body moves in this plane constitutes such a closed surface of section $S_4$. But in the general case the periodic trajectories are isolated and there do not seem to exist such invariant families; hence one must give up the hope of obtaining regular surfaces of section $S_{2m-2}$ with boundaries of $2m - 3$ dimensions (1).

It should also be remarked that the general results thus obtained on the existence of such regular surfaces of section are equally valid for differential systems not of dynamical origin. Only, we cannot say in this case, without a preliminary examination, that a system which is analytic in the parameter $\mu$ and "integrable" for $\mu = \mu_0$ satisfies the necessary criterion for $\mu - \mu_0$ sufficiently small.

The systems which do not possess such regular surfaces of section $S_{2m-2}$ are certainly more complicated in general than the systems which admit them. One could even classify the systems which do not admit them according to the number and character of the discontinuities of $T$ for an arbitrary surface of section.

In truth, the reduction of a differential system to a point transformation $T$ constitutes a reduction which can be made generally and which must be considered fundamental from the theoretical point of view, even if a regular surface of section does not exist.


(1) Unless invariant analytic families of trajectories exist in general.




CHAPTER II.

The e-periodic solutions and the neighboring solutions.

1. General classification of periodic solutions.

By means of the reduction to a transformation $T$ one can classify the periodic solutions. In a long memoir published in the Acta Mathematica (1), I undertook this classification systematically (2). The local study which I made there allowed me to discuss, up to a certain point, every possible case except the extremely degenerate case of a formal rotation through an angle incommensurable with $2\pi$, that is to say, the case in which $T$ reduces to such a rotation when suitable variables given by convergent or divergent series are employed. In this case and in this case only a formal power of $T$ reduces to the identity transformation. Because of the intrinsic importance of the local properties for the general theory, I shall carry this study further here, while recalling the methods and results of my memoir.

Let us begin by writing $T$ in the following form,

\[ u_1 = f(u, v) = au + bv + \cdots, \qquad v_1 = g(u, v) = cu + dv + \cdots \qquad (ad - bc > 0), \tag{1} \]

where the origin is the fixed point of $T$ corresponding to the periodic solution that we wish to study. Since $T$ is conservative, there is an integral invariant $\iint Q(u, v)\, du\, dv$, with $Q > 0$ real and analytic. From this fact one deduces immediately that

\[ ad - bc = 1 \tag{2} \]

by employing the identity

\[ Q(u, v) = Q(u_1, v_1)\, \frac{\partial(u_1, v_1)}{\partial(u, v)} \tag{3} \]

at the origin, where $u_1 = u = 0$, $v_1 = v = 0$ and the Jacobian reduces to $ad - bc$.

(1) Surface transformations and their dynamical applications, Acta Mathematica, vol. 43 (1922). This memoir will be cited as S. T. hereafter.

(2) The researches of Morse on periodic solutions must be mentioned here. See his Minneapolis Colloquium Lectures, which have just appeared under the title The Calculus of Variations in the Large. Morse deals with periodic solutions chiefly from the point of view of the calculus of variations.

There always exists (1) a function

\[ F^*(u, v) = p u^2 + q uv + r v^2 + \cdots \tag{4} \]

given by an infinite series in $u$, $v$ (convergent or divergent) such that the two purely formal differential equations

\[ \frac{du}{dk} = -\frac{1}{Q}\frac{\partial F^*}{\partial v}, \qquad \frac{dv}{dk} = \frac{1}{Q}\frac{\partial F^*}{\partial u} \tag{5} \]

define $T^k$ for every value of $k$ in the following manner. Let us write the formal solution as a series in powers of $k$ in such a way that the two series for $u_k$ and $v_k$ reduce to $u_0$ and $v_0$ respectively for $k = 0$. Then $(u_k, v_k)$ will express the coordinates of the transform of $(u_0, v_0)$ by $T^k$, whatever the value of $k$. For example, if $T$ is given simply by

\[ u_1 = \rho u, \qquad v_1 = \frac{1}{\rho}\, v \qquad (\rho > 1), \]

one will have $F^* = -uv \log \rho$, hence

\[ \frac{du}{dk} = u \log \rho, \qquad \frac{dv}{dk} = -v \log \rho, \]

and consequently

\[ u = u_0 \rho^{k} = u_0 (1 + k \log \rho + \cdots), \qquad v = v_0 \rho^{-k} = v_0 (1 - k \log \rho + \cdots). \]
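The example can be checked numerically. The sketch below, in Python, integrates $du/dk = u \log\rho$, $dv/dk = -v \log\rho$ from $k = 0$ to $k = 1$ and compares the result with $T$ itself; it also forms the fractional power $T^{1/2}$ from the closed formulas. The value of `RHO`, the starting point and the function names are assumptions made for the illustration.

```python
import math

RHO = 2.0  # hypothetical multiplier; any rho > 1 would do

def T(u, v):
    """The linear hyperbolic transformation u1 = rho*u, v1 = v/rho."""
    return RHO * u, v / RHO

def T_power(u0, v0, k):
    """Closed form of T^k for real k: (u0 * rho**k, v0 * rho**(-k))."""
    return u0 * RHO ** k, v0 * RHO ** (-k)

def flow(u0, v0, k, steps=1000):
    """Integrate du/dk = u*log(rho), dv/dk = -v*log(rho) from 0 to k by RK4."""
    a = math.log(RHO)
    h = k / steps
    u, v = u0, v0
    for _ in range(steps):
        k1u, k1v = a * u, -a * v
        k2u, k2v = a * (u + 0.5 * h * k1u), -a * (v + 0.5 * h * k1v)
        k3u, k3v = a * (u + 0.5 * h * k2u), -a * (v + 0.5 * h * k2v)
        k4u, k4v = a * (u + h * k3u), -a * (v + h * k3v)
        u += h * (k1u + 2.0 * k2u + 2.0 * k3u + k4u) / 6.0
        v += h * (k1v + 2.0 * k2v + 2.0 * k3v + k4v) / 6.0
    return u, v

u0, v0 = 1.5, -0.4
print("T applied once:      ", T(u0, v0))
print("flow from k=0 to k=1:", flow(u0, v0, 1.0))
half = T_power(u0, v0, 0.5)
print("T^(1/2) twice:       ", T_power(half[0], half[1], 0.5))
```

The product $uv$, and with it $F^* = -uv \log\rho$, keeps the same value at the starting point and at its image, which is the invariance of $F^*$ under $T$ in this special case.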


Equations (5) show us that the function $F^*$ is invariant under $T$. There will also exist formal curves containing the origin,

\[ u = \alpha_1 t + \alpha_2 t^2 + \cdots, \qquad v = \beta_1 t + \beta_2 t^2 + \cdots, \]

which are invariant under the transformation $T$, that is to say, such that one has identically

\[ u_1 = \alpha_1 t_1 + \alpha_2 t_1^2 + \cdots, \qquad v_1 = \beta_1 t_1 + \beta_2 t_1^2 + \cdots \qquad (t_1 = ct + dt^2 + \cdots, \ c \neq 0). \]

Indeed these are precisely the curves for which $F^*(u, v) = 0$ from the formal point of view. Let us remark in passing that the set of formal invariant curves (with their multiplicities) determines the function $F^*$ up to a factor $G$, where the series $G$ contains a constant term ($\neq 0$).

(1) See S. T., pp. 1-13. In the special case where the roots of the characteristic equation $\rho^2 - (a + d)\rho + 1 = 0$ are $k$-th roots of unity, one must begin with the power $T^k$ instead of $T$.

We distinguish in general between the "hyperbolic" case, in which there exists at least one real invariant curve (that is to say, one for which $\alpha_1, \alpha_2, \ldots$, $\beta_1, \beta_2, \ldots$ and $c, d, \ldots$ may be taken real), and the "elliptic" case, in which no such real curves exist. We call the corresponding periodic solutions of hyperbolic or elliptic type respectively or, more briefly, h-periodic or e-periodic.

This bifurcation is absolutely fundamental from the theoretical point of view. One can explain in a few words why there is a qualitative difference between the two cases.

In the hyperbolic case the real invariant curves correspond to invariant hypercontinuous curves (1) composed of points which tend asymptotically toward the fixed point under $T^k$ or under $T^{-k}$ ($k = 1, 2, 3, \ldots$). It may happen that some of these curves are analytic and even composed of invariant points. All the points in the neighborhood which do not lie on one of these actual invariant curves approach the origin and then recede from it when the transformation $T$ is indefinitely repeated. One thus sees that an h-periodic solution is always unstable.

In the elliptic case, on the contrary, the iterated points follow approximately closed curves around the origin almost indefinitely. In this case the periodic solution will be, so to speak, almost stable, and its coordinates will be expressed with great accuracy for a long time by trigonometric series, in general divergent; these approximately invariant "curves" are of the general form $F^{**} = \text{const.}$, where the function $F^{**}$ is a function analytic at the origin whose series development coincides with that of $F^*$ up to terms of very high degree.

(1) I call "hypercontinuous" a regular real curve $u = \varphi(\tau)$, $v = \psi(\tau)$, where $\varphi(0) = \psi(0) = 0$ but $\varphi'(0)^2 + \psi'(0)^2 \neq 0$, such that $\varphi(\tau)$, $\psi(\tau)$ are analytic for $\tau > 0$ and are continuous together with all their derivatives for $\tau = 0$.

Nevertheless we distinguish here also between the unstable case and the stable case. In the unstable case (by definition) every neighborhood $\sigma$ of the fixed point (however small) will be transformed by $T, T^2, \ldots$ into neighborhoods which will eventually pass outside a neighborhood $s$ given in advance. In this case one of the successive images of $\sigma$ obtained by indefinitely repeating the inverse transformation $T^{-1}$ must pass outside $s$ also. Otherwise the set of neighborhoods (1),

\[ S = \sigma + T^{-1}(\sigma) + T^{-2}(\sigma) + \cdots , \]

will be transformed by $T^{-1}$ into

\[ T^{-1}(\sigma) + T^{-2}(\sigma) + \cdots , \]

that is to say, either into $S$ or into a part of $S$. But the latter possibility is excluded since $T$ is conservative, so that $T^{-1}(S) = S$. This shows us that the simply connected region $S^*$ around the origin which is enclosed by $S$ is likewise invariant under $T^{-1}$, and hence under $T$ also. One thus sees that if $\sigma$ is chosen small enough to lie in the interior of $S^*$, all its images would remain there too, which contradicts the hypothesis of instability. This manner of reasoning leads us immediately to the simple criterion of Poincaré:

In order that a fixed point of a conservative transformation $T$ be stable, it is necessary and sufficient that there exist an infinite number of open, simply connected neighborhoods of this point, of arbitrarily small diameters, which are invariant under $T$.

According to this definition every h-periodic solution is unstable. The e-periodic solutions may be either unstable (ei-periodic) or stable (es-periodic).

(1) Considering $\sigma$ and hence $S$ as open regions, we add, if need be, the enclosed interior points of $S$ to the region $S$ as it is defined by this equation.


2. The $\alpha$- and $\omega$-asymptotic solutions of an ei-periodic solution.

We shall consider in the first place the "general" case of an unstable elliptic point. In this case we shall have $pr - \tfrac{1}{4}q^2 > 0$, that is to say, the three terms of the second degree in the series $F^*$ correspond to a definite quadratic form. In this case a suitable formal transformation of the variables will reduce $T$ to the normal form (1)

\[ \begin{aligned} u_1 &= u \cos(\theta + c r^{2m}) - v \sin(\theta + c r^{2m}), \\ v_1 &= u \sin(\theta + c r^{2m}) + v \cos(\theta + c r^{2m}), \end{aligned} \tag{6} \]

where $r^2 = u^2 + v^2$, or, in the degenerate case cited,

\[ \begin{aligned} u_1 &= u \cos\theta - v \sin\theta, \\ v_1 &= u \sin\theta + v \cos\theta. \end{aligned} \tag{6'} \]



In both cases (6) and (6'), $\theta/2\pi$ is not a rational number, and we may also suppose that $0 < \theta < 2\pi$. The corresponding invariant series are respectively


F* 


iL i \ 

2 ' 4 (m -f" I)/ 



But a linear transformation $u = k\bar{u}$, $v = k\bar{v}$ will reduce the constant $c$ in (6) to $\pm 1$. Hence there is only a single invariant here, the angle $\theta$ (mod $2\pi$). Let us remark also that for $T^{-1}$ the constant $c$ is replaced by $-c$; hence we may always suppose that $c$ is positive.



(1) In these equations one has $m = 1$ in general; if one condition is satisfied, one has $m = 2$; and so on. It is only when an infinite number of conditions hold that one falls upon the degenerate case.




As I have shown (loc. cit.), the following property plays an entirely fundamental role in the theory of general elliptic points: In the general non-degenerate elliptic case ($m$ finite, $c > 0$) every radial direction in the neighborhood of the fixed point $O$ (in suitable coordinates) is turned toward the left of that direction (see the accompanying figure). In other words, the angle of rotation about $O$ increases with the distance from the fixed point.
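This twist property is easy to observe on the normal form itself. The Python sketch below applies a rotation through the angle $\theta + c\,r^{2m}$ to points of one radial line and lists the angle through which each is turned. The constants `THETA`, `C`, `M` and the sampled radii are assumptions chosen for the illustration; only the shape of the formula is taken from the normal form (6).

```python
import math

# Hypothetical constants: THETA/(2*pi) is irrational, C > 0, M = 1.
THETA = math.pi * (math.sqrt(5.0) - 1.0)
C = 1.0
M = 1

def T(u, v):
    """Rotation about the origin through the angle THETA + C * r^(2M)."""
    r2 = u * u + v * v
    w = THETA + C * r2 ** M  # r^(2M) written as (r^2)^M
    return (u * math.cos(w) - v * math.sin(w),
            u * math.sin(w) + v * math.cos(w))

def rotation_angle(r):
    """Angle in [0, 2*pi) through which T turns the point (r, 0)."""
    u1, v1 = T(r, 0.0)
    return math.atan2(v1, u1) % (2.0 * math.pi)

radii = [0.1 * j for j in range(1, 6)]
angles = [rotation_angle(r) for r in radii]
for r, a in zip(radii, angles):
    print("r = %.1f  rotation = %.6f" % (r, a))
```

Each point stays on its circle $r = \text{const.}$, while the outer points of the radial segment are turned further than the inner ones, so the image of the segment leans to one side of the radial direction.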

I shall give very briefly some indications of the manner in which this property shows itself to be important.

Consider a circle $\sigma$ of small radius $\rho$ with $O$ as center. By indefinite iteration of the transformation $T$ one obtains the successive images $T(\sigma), T^2(\sigma), \ldots$. Let $T^n(\sigma)$ be the first of these images which passes outside a fixed circle $s$ given in advance; since the point $O$ is unstable, such an integer $n$ will always exist. Naturally the integer $n$ depends on $\rho$ and increases to infinity as $\rho$ tends to zero.

Let $P_n$ be the image of a point $P$ of the circumference of $\sigma$ such that $P_n$ lies outside the fixed circle $s$. Consider the radial line $OP$ and its successive images $OP_1, OP_2, \ldots, OP_n$. Because of the property stated, all these images will be turned everywhere toward the left of the radial direction; in particular this will be true of the part $OQ_n$ of $OP_n$, where $Q_n$ is the first point of $OP_n$ on the circumference of $s$. Hence in the plane of $r$, $\varphi = \tan^{-1}(v/u)$, considered as rectangular coordinates (see figure 8), we find an arc $OQ_n$, turned toward the left of the radial direction, which joins the $\varphi$-axis to the line $C$ corresponding to the circumference of $s$. The point $Q_n$ will be the point of this arc farthest to the left, while the point on the line $r = 0$ is the farthest to the right. The angular separation between this point and $Q_n$ increases indefinitely as $\rho$ tends to zero, as a study of (6) shows. All the $n$ successive images of $OQ_n$ under $T^{-1}$ remain in the interior of the band between $C$ and $r = 0$ and are also turned toward the left of the vertical direction.

Letting $\rho$ tend to zero and hence $n$ to infinity, one concludes that there necessarily exists a closed connected set, extending from a point $Q$ of $C$ to the line $r = 0$, such that all its images under $T^{-1}$ lie completely in the interior of the band. Consequently there exists a particular closed connected set composed of all the points $P$ such that $P, T^{-1}(P), T^{-2}(P), \ldots$ lie in the circle $s$ and which are connected to $O$ by such points. It is this set that we shall denote by $\Sigma_\alpha$. The set $\Sigma_\alpha$ cannot enclose any area about $O$; otherwise this area would form an invariant region in the interior of the given circle, and such a region cannot exist in the unstable case which we are considering.

An immediate argument shows us that $\Sigma_\alpha$ is itself turned toward the left of the radial direction in the following sense: one can reach every point $D$ outside $\Sigma_\alpha$ by following an arc $AD$ (see figure 8) which starts from a point $A$ of the circle $C$ and turns everywhere toward the left of the radial direction without crossing itself. Indeed the points inaccessible by such an arc are transformed by $T^{-1}$ into inaccessible points, and consequently the inaccessible areas would be transformed into a part of themselves, which is not possible.

Let us now prove that the sets $\Sigma_\alpha, T^{-1}(\Sigma_\alpha), T^{-2}(\Sigma_\alpha), \ldots$ tend asymptotically toward the fixed point $O$ in a uniform manner. Otherwise there exist values $n_1, n_2, \ldots$ such that $\lim n_j = \infty$ and such that the sets $T^{-n_j}(\Sigma_\alpha)$ extend outside a fixed circle $r = r_0$. Hence there must exist subsets of $T^{-n_j}(\Sigma_\alpha)$ which are closed, connected to $O$, and which contain at least one point on $r = r_0$. Such a subset $\Sigma'_\alpha$ and its images $T^k(\Sigma'_\alpha)$ ($k \leq n_j$) lie entirely within the circle $r = r_0$. Letting $n_j$ tend to infinity, we should thus obtain a set $\Sigma_{\alpha\omega}$, closed and connected to $O$, extending from $O$ to a point of $r = r_0$, all of whose images (under $T$ or $T^{-1}$) lie in the interior of $s$. Moreover every point outside $\Sigma_{\alpha\omega}$ would have to be accessible from the exterior, whether from the left or from the right of the radial direction. One concludes that $\Sigma_{\alpha\omega}$ must be composed of radial segments. But this is impossible. Indeed $T(\Sigma_{\alpha\omega})$ would not have the same property, since $T$ turns every radial direction toward the left, and nevertheless the same manner of reasoning shows that $T(\Sigma_{\alpha\omega})$ must have it.

Finally we observe that the set $\Sigma_\alpha$ extends from a point $Q$ of the line $C$ indefinitely toward the right, for the following reason. It is evident that the part of the plane above $\Sigma_\alpha$ and to the right of $Q$ forms a subset $\Sigma_\alpha^*$, closed and connected to $O$, which moreover lies entirely to the right of $Q$ (see figure 8). If the angular extent of $\Sigma_\alpha^*$ is finite, the extent of $T^{-k}(\Sigma_\alpha^*)$ would increase indefinitely (1) with $k$, while all these sets lie in the interior of $C$. Hence $\Sigma_\alpha^*$ and $T^{-k}(\Sigma_\alpha^*)$ would have to cross and would enclose an area of points connected to $O$, and this area would tend uniformly toward $O$. This is impossible since $T$ is conservative.

(1) In consequence of the particular normal form (6) of the transformation $T$. This normal transformation is almost the same as the transformation $T$ in suitable coordinates.

One can reason in the same manner on the inverse transformation $T^{-1}$.

Let us now say that a point is an $\alpha$ ($\omega$) point when all its images under $T^{-1}$ ($T$) remain in the interior of the given neighborhood of $O$. We can formulate our conclusions as follows:

In every unstable non-degenerate elliptic case there exist two closed connected sets $\Sigma_\alpha$ and $\Sigma_\omega$, composed of $\alpha$ and $\omega$ points respectively, which are turned toward the left and toward the right of the radial direction respectively, and which extend from the boundary of the chosen neighborhood to the fixed point, winding indefinitely around the fixed point toward the right and toward the left respectively in the angular sense (1).

It is evident that the exterior areal measures of $\Sigma_\alpha$ and of $\Sigma_\omega$ are zero.

The results thus summarized, as well as the new results of this chapter, translate immediately into corresponding results concerning the non-degenerate ei-periodic solutions and their $\alpha$- and $\omega$-asymptotic solutions.

We shall now establish further facts concerning the sets $\Sigma_\alpha$ and $\Sigma_\omega$.

Let us say that a point $P$ of the set $\Sigma_\alpha$ is at the "$\alpha$-asymptotic distance" $r$ from the fixed point $O$ if $P$ and its images under $T^{-1}$ are not all connected to $O$ by closed subsets of $\Sigma_\alpha$ lying at a distance less than $r - \varepsilon$ ($\varepsilon > 0$) from $O$, while $P$ and all its images under $T^{-1}$ are connected to $O$ by such subsets of $\Sigma_\alpha$ at a distance from $O$ equal to $r$ or smaller. Such an $r < r_0$ will always exist.

(1) See S. T., Chap. 3.

Let us remark first that if $P$ is at an α asymptotic distance $\rho$ from the fixed point $O$, it must belong to the set $\Sigma_\alpha^{(\rho)}$ defined with respect to the circle $r = \rho$. To establish this fact, we denote by $\Sigma_0, \Sigma_1, \Sigma_2, \ldots$ the closed subsets of $\Sigma_\alpha$ which join the points $P, P_{-1}, P_{-2}, \ldots$ respectively to $O$, each of these points being inside the circle $r = \rho$. Consider the closed set $T(\Sigma_1)$, which is connected to $O$. Since $\Sigma_0$ and $T(\Sigma_1)$ do not enclose an area, they must have a closed part $\Sigma'$ in common which is connected to $O$ and which contains the point $P$. This part $\Sigma'$ can replace $\Sigma_0$, while $T^{-1}(\Sigma')$ can replace $\Sigma_1$. By an analogous argument we find a set $\Sigma''$, closed, connected to $O$ and containing $P$, such that $\Sigma_0$, $\Sigma_1$ and $\Sigma_2$ can be replaced by $\Sigma''$, $T^{-1}(\Sigma'')$ and $T^{-2}(\Sigma'')$ respectively. In this way we define an infinite sequence of sets $\Sigma', \Sigma'', \ldots$ whose limit set $\Sigma^*$ is closed, connected to $O$, and contains $P$, and is such that $\Sigma^*, T^{-1}(\Sigma^*), T^{-2}(\Sigma^*), \ldots$ all lie at a distance less than $\rho$ from $O$. Consequently $P$ will belong to the set $\Sigma_\alpha^{(\rho)}$, as we stated.

We can now prove two more precise results concerning $\Sigma_\alpha$ and $\Sigma_\omega$. The first, which reflects the approach of the asymptotic sets $\Sigma_\alpha$ and $\Sigma_\omega$ to the fixed point, may be stated as follows:

In the same case there exists a positive decreasing function of $r$, $\psi_\alpha(r)$, with $\psi_\alpha(0) = +\infty$, such that (1) any point of $\Sigma_\alpha$ at an α asymptotic distance $r$ from the fixed point $O$ will always be transformed into points at an α asymptotic distance less than $s$ ($s < r$) after $\psi_\alpha(s') - \psi_\alpha(r)$ iterations of $T^{-1}$, where $s'$ ($< s$) is small enough that $\psi_\alpha(s') > \psi_\alpha(s)$, and (2) there exist an infinite number of pairs of distances $r$, $s$ between the successive members of a decreasing sequence

$$\rho_1, \rho_2, \ldots, \qquad \lim \rho_n = 0,$$

and corresponding points $P$, for which at least $\psi_\alpha(s') - \psi_\alpha(r)$ iterations of $T^{-1}$ are necessary before the α asymptotic distance $r$ of the point $P$ is transformed into a distance less than $s$.

There also exists an analogous associated function $\psi_\omega(r)$, obtained by considering the transformation $T$ instead of $T^{-1}$.

To prove this we shall construct such a function $\psi_\alpha(r)$. Let $n_i$ be the number of iterations of $T^{-1}$ which are necessary for the set $\Sigma_\alpha^{(\rho_i)}$, relative to the circle $r = \rho_i$, to be transformed permanently into sets lying inside $r = \rho_{i+1}$. By what precedes, such integers $n_i$ always exist. For $\rho_{i+1} < r \le \rho_i$ let us define $\psi_\alpha(r)$ to be equal to $n_1 + n_2 + \cdots + n_i$. We shall then have, for any $r$ and $s' < r$,

$$\psi_\alpha(s') - \psi_\alpha(r) = n_{i+1} + \cdots + n_j$$

with

$$\rho_{i+1} < r \le \rho_i, \qquad \rho_{j+1} < s' \le \rho_j \quad (j \ge i).$$

Now let $P$ be a point at an α asymptotic distance $r$ from $O$; then $P$ will belong to the set $\Sigma_\alpha^{(\rho_i)}$ defined with respect to the neighborhood $r \le \rho_i$, as we have just seen.

After $n_i$ iterations of $T^{-1}$, the point $P$ will be transformed into a point $P_{n_i}$ at an α asymptotic distance from $O$ of at most $\rho_{i+1}$; after $n_{i+1}$ further iterations this distance will be reduced to $\rho_{i+2}$ at least; and so on. Hence after $\psi_\alpha(s') - \psi_\alpha(r)$ iterations of $T^{-1}$ the asymptotic distance cannot exceed $\rho_{j+1} < s$. Consequently the first property stated is true.
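The bookkeeping behind $\psi_\alpha$ can be illustrated with a short sketch. It assumes a finite, strictly decreasing list of radii and invented iteration counts (`rhos`, `counts` and `make_psi` are hypothetical names, not taken from the text). It reproduces only the step-function definition and the telescoping identity for $\psi_\alpha(s') - \psi_\alpha(r)$; it does not compute the integers $n_i$ for any actual transformation.

```python
def make_psi(rhos, counts):
    """Build the step function psi(r) = n_1 + ... + n_i for rho_{i+1} < r <= rho_i.

    rhos   -- strictly decreasing radii rho_1 > rho_2 > ... > rho_N > 0 (illustrative)
    counts -- integers n_1, ..., n_N (illustrative values, not derived from T)
    The lists are finite, so every r in (0, rho_N] is assigned to the last step.
    """
    if len(rhos) != len(counts):
        raise ValueError("need one count per radius")
    if any(a <= b for a, b in zip(rhos, rhos[1:])):
        raise ValueError("radii must be strictly decreasing")

    def psi(r):
        if not 0 < r <= rhos[0]:
            raise ValueError("r must lie in (0, rho_1]")
        total = 0
        for i, n_i in enumerate(counts):
            total += n_i
            is_last = i + 1 == len(rhos)
            # Stop at the first step whose lower radius lies below r.
            if is_last or r > rhos[i + 1]:
                return total
        return total

    return psi


rhos = [1.0, 0.5, 0.25, 0.125]
counts = [3, 5, 8, 13]
psi = make_psi(rhos, counts)

# r = 0.8 lies in (rho_2, rho_1], s' = 0.2 lies in (rho_4, rho_3]:
# the difference is n_2 + n_3, the sum n_{i+1} + ... + n_j with i = 1, j = 3.
difference = psi(0.2) - psi(0.8)
```

The function decreases as $r$ increases, and in the untruncated construction it is unbounded as $r \to 0$, which matches the condition $\psi_\alpha(0) = +\infty$.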

To prove the second property we need only remark that, by the very definition of the function $\psi_\alpha(r)$, there exists a point $P$ at an α asymptotic distance $r$ from $O$, with $\rho_{i+1} < r \le \rho_i$, such that $n_i$ iterations of $T^{-1}$ are necessary before the point $P$ is transformed into a point at an α asymptotic distance of at most $\rho_{i+1}$ from $O$. Indeed we shall have

$$n_i = \psi_\alpha(s') - \psi_\alpha(r) \quad \text{for} \quad \rho_{i+2} < s' \le \rho_{i+1},$$

and consequently at least $\psi_\alpha(s') - \psi_\alpha(r)$ iterations are necessary for the asymptotic distance $r$ of $P$ from $O$ to be reduced to $s$, which proves the second part of our statement.

These functions $\psi_\alpha(r)$ and $\psi_\omega(r)$ characterize the approach of a point of $\Sigma_\alpha$ or of $\Sigma_\omega$ to $O$. It is proved without difficulty that in the elliptic case they increase more rapidly than any negative power of $r$ (1), by relying on the fact that the function $r^2 = u^2 + v^2$ (in suitable coordinates $u$, $v$) is invariant under the iteration of $T^{-1}$ or $T$ up to a term of order $\mu$ ($\mu$ being any positive integer).

In our second result, which is somewhat analogous, we are concerned with the points near the sets $\Sigma_\alpha$ and $\Sigma_\omega$:

In the same case there exist a positive increasing function $\kappa_\alpha(r)$, with $\kappa_\alpha(0) = 0$, and a positive decreasing function $\lambda_\alpha(r)$, with $\lambda_\alpha(0) = \infty$, such that (1) every point at a distance less than $r$ from the fixed point $O$ remains at a distance less than $\kappa_\alpha(r)$ from the set $\Sigma_\alpha$ during at least $\lambda_\alpha(r)$ iterations of $T$, and (2) there exists such a point which is transformed outside the given region after $\lambda_\alpha(r)$ iterations of $T$.

There also exist two analogous functions $\kappa_\omega(r)$ and $\lambda_\omega(r)$.

We shall define these two functions. First let us choose any positive value $r_0$ of $r$ and consider the successive images under $T$ of the circle $r = r_0$. There certainly exists an integer $n_0$, the least possible, such that, after $n_0$ iterations of $T$, a point $P$ of this circle leaves the given circle. Moreover all these images before the last lie within a certain distance, at most $s$, of $\Sigma_\alpha$. We shall take $\kappa_\alpha(r_0)$ to be equal to $s$. By its definition this function increases with $r$.

(1) See D. S., Chap. 4.

Let us now prove that $\kappa_\alpha(r)$ tends to 0 with $r$. In the contrary case we should find a decreasing sequence of values of $r$,

$$r_1, r_2, \ldots \qquad (\lim r_k = 0),$$

and a corresponding increasing sequence of values of $n$,

$$n_1, n_2, \ldots \qquad (\lim n_k = \infty),$$

such that the circle $r = r_k$ remains inside the given region during $n_k$ iterations of $T$ but passes beyond a fixed positive distance ($d > 0$) from the set $\Sigma_\alpha$ at least once.

Suppose then that after $m_k$ iterations of $T$ ($m_k \le n_k$) a point $P_{m_k}$ lies on the boundary of this $d$ neighborhood of $\Sigma_\alpha$. There exists then a radial line $OP$ inside the circle $r = r_k$ all of whose images $OP_1, \ldots, OP_{m_k - 1}$ lie inside the $d$ neighborhood of $\Sigma_\alpha$, while $OP_{m_k}$ extends to the point $P_{m_k}$ of this boundary. As $k$ increases, $r_k$ will tend to zero, and $m_k$ as well as $n_k$ will tend to infinity. Consequently, on passing to the limit, we obtain a set $\Sigma_\alpha^*$, closed, connected with $O$, which extends to the boundary of the $d$ neighborhood of $\Sigma_\alpha$, and which remains inside the given region under indefinite iteration of $T^{-1}$. But in this case $\Sigma_\alpha^*$ would necessarily belong to $\Sigma_\alpha$, which is impossible. Hence we conclude that $\kappa_\alpha(r)$ tends to 0 with $r$.

In this way we associate with the circle of radius $r$ about the fixed point a corresponding $\kappa_\alpha(r)$ neighborhood of $\Sigma_\alpha$ such that (1) all the images of this circle under $T$ remain in this $\kappa_\alpha(r)$ neighborhood of $\Sigma_\alpha$ during $n(r)$ iterations of $T$ and, moreover, (2) there exists a point $P$ of this circle which is transformed outside the given region after $n(r)$ iterations of $T$. Thus, on defining $\lambda_\alpha(r)$ to be equal to $n(r)$, we obtain functions $\kappa_\alpha(r)$, $\lambda_\alpha(r)$ having the two required properties.

This result tells us in a precise manner that the points near the fixed point $O$ remain in a certain small neighborhood of $\Sigma_\alpha$ during many successive iterations of $T$, up to the moment when a point $P$ of this neighborhood of $O$ is transformed outside the given region.


3. The transformation of $\Sigma_\alpha$ and $\Sigma_\omega$ by $T$.

To penetrate further, let us first consider the points of $\Sigma_\alpha$ which are accessible (1) from the exterior of $\Sigma_\alpha$. These are arranged in cyclic order around the fixed point $O$. It will be convenient to transform the exterior region into the interior of a circle $\Gamma$ of radius 1 by means of an analytic transformation $w = h(z)$ ($z = u + iv$). In this case the accessible points of $\Sigma_\alpha$ correspond to points everywhere dense on $\Gamma$ (2). More exactly, an accessible point, together with a neighborhood attached to the Jordan arc which renders it accessible, will correspond to a point of $\Gamma$; hence the same geometric point might appear more than once. Let us remark also that the fixed point itself might be accessible in this manner. In that case there will exist (in the plane where $r$, $\varphi$ are represented as rectangular coordinates) an indefinite Jordan arc which approaches the axis of $\varphi$ without having any point in common with the set $\Sigma_\alpha$. We shall study this possibility more closely in the following section.

(1) Along a Jordan arc.

It may happen that the image $T^{-k}(\Sigma_\alpha)$ of $\Sigma_\alpha$, which is always a part of $\Sigma_\alpha$ for $k > 0$, is composed entirely of inaccessible points of $\Sigma_\alpha$ for $k$ sufficiently large. Let us set this case aside.

The transformation $T$ does not change the cyclic order of the accessible points of $\Sigma_\alpha$ which belong to $T^{-1}(\Sigma_\alpha)$, nor consequently the order of the corresponding points of $\Gamma$. By this transformation $\Sigma_\alpha$ is transformed into the set $T(\Sigma_\alpha)$, which contains $\Sigma_\alpha$ as a proper subset. This amounts only to saying that $T^{-1}(\Sigma_\alpha)$ is a subset of $\Sigma_\alpha$. We see therefore that $\Gamma$ is divided into points and intervals which are transformed by $T$ into points without modification of their cyclic order. Consequently, if $\varphi$ denotes the angular coordinate of any point of $\Gamma$, and $\varphi_1$ denotes the coordinate of the transformed point, we shall have $\varphi_1 = \varphi + f(\varphi)$, where $\varphi + f(\varphi)$ is a non-decreasing function and $f(\varphi)$ is a continuous periodic function of period $2\pi$.
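A map of the form $\varphi_1 = \varphi + f(\varphi)$ has a mean angular advance per iteration, which is what the rotation coefficient measures. The following is a minimal numerical sketch, not the construction used in the text: the sample function $f(\varphi) = \theta + \varepsilon \sin\varphi$ and the names `rotation_coefficient`, `power` and `example_lift` are assumptions chosen only for illustration.

```python
import math


def rotation_coefficient(lift, phi0=0.0, iterations=20000):
    """Estimate the mean angular advance per step, in radians, of a circle map.

    `lift` maps an unwrapped angle phi to phi + f(phi); the angle is never
    reduced modulo 2*pi, so the total advance can simply be averaged.
    """
    phi = phi0
    for _ in range(iterations):
        phi = lift(phi)
    return (phi - phi0) / iterations


def power(lift, k):
    """Return the lift of the k-th iterate of the map."""
    def lift_k(phi):
        for _ in range(k):
            phi = lift(phi)
        return phi
    return lift_k


def example_lift(phi, theta=0.7, eps=0.3):
    """Illustrative lift: f(phi) = theta + eps*sin(phi); phi + f(phi) is non-decreasing for |eps| <= 1."""
    return phi + theta + eps * math.sin(phi)


sigma = rotation_coefficient(example_lift)
sigma_third_power = rotation_coefficient(power(example_lift, 3))
rigid = rotation_coefficient(lambda phi: phi + 0.7)
```

For a rigid rotation the estimate returns the rotation angle itself, and for the third iterate it returns approximately three times the coefficient of the map, in line with the remark in the text that the coefficient of $T^k$ is $k\sigma_\alpha$. A finite average approximates the limit only to within an error of order $1/\text{iterations}$.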

Let us remark that any point of $\Gamma$ which does not correspond to the fixed point is finally transformed by repetition of $T$ into a point which does not belong to $\Sigma_\alpha$; otherwise the successive images of all the points of $\Sigma_\alpha$ under $T^{-1}$ could not approach the fixed point uniformly as they do. It follows that all the points of $\Gamma$ which correspond to an accessible point other than the fixed point lie inside an interval which reduces to a point after a finite number of iterations of $T$.

(2) We rely here on well-known results. In fact, the use of such results could easily be avoided by considering the boundary of $\Sigma_\alpha$ directly.

Let us now say that two points $P$ and $Q$ of $\Gamma$ are of the "same type" if the successive images of the arc $PQ$ under $T$ finally reduce to a single point. Evidently if $P$ and $Q$, as well as $Q$ and $R$, are of the same type, so are $P$ and $R$. The circle $\Gamma$ is thus divided into points and intervals such that each interval corresponds to the points of a single type. Moreover, the transformation $T$ will transform these points and intervals of $\Gamma$ into corresponding points and intervals, and will not change the cyclic order of these elements. Hence, by making the individual points of corresponding intervals correspond by similarity, we obtain a transformation $T^*$ of $\Gamma$ into itself which is one-to-one and continuous. We shall call the corresponding rotation coefficient "the rotation coefficient" of the set $\Sigma_\alpha$ under $T$ (1).

The points of $\Gamma$ cannot all reduce to a single interval, because that happens only in the excluded case where, for a finite value of $k$, $T^{-k}(\Sigma_\alpha)$ contains no accessible point of $\Sigma_\alpha$. Hence $\Gamma$ is transformed into itself by any power of $T$, and there will always exist at least two elements (points and intervals). Consequently the rotation coefficient is always well defined relative to the given region.

Furthermore, this coefficient $\sigma_\alpha$ does not depend on the region chosen around the fixed point since, according to our hypothesis, every $T^{-k}(\Sigma_\alpha)$ contains accessible points of $\Sigma_\alpha$ which, for $k$ large enough, are also accessible points of the set $\Sigma_\alpha^*$ taken relative to an arbitrary smaller region.

Let us remark also that the analogous coefficient for the transformation $T^k$ ($k$ arbitrary) with respect to the same region will always be $k\sigma_\alpha$. Indeed, for $k > 0$ the corresponding set $\Sigma_\alpha^{(k)}$ will contain the set $\Sigma_\alpha$, and in particular one of the points $Q$ of $\Sigma_\alpha$ on the circumference of the given circle. Such a point $Q$ will be at once an accessible point of $\Sigma_\alpha$ and of $\Sigma_\alpha^{(k)}$. But the rotation coefficient is entirely determined by the cyclic order between a single point and its images. Hence the coefficient of $T^k$ must be equal to $k\sigma_\alpha$.

We can now prove the following results:

In the unstable non-degenerate elliptic case, either $\Sigma_\alpha$ ($\Sigma_\omega$) is transformed by $T^{-k}$ ($T^k$) into a part of itself which is composed entirely of inaccessible points of $\Sigma_\alpha$ ($\Sigma_\omega$) for $k$ sufficiently large, or the accessible points of $\Sigma_\alpha$ ($\Sigma_\omega$) are arranged in cyclic order around the fixed point with a rotation coefficient equal to the constant $\theta$ of the normal form (6).

We may suppose that the first alternative does not hold for $\Sigma_\alpha$, for example, and hence that $T^{-k}(\Sigma_\alpha)$ contains accessible points of $\Sigma_\alpha$ for every $k > 0$.


(1) An exposition of the elementary properties of the rotation coefficient will be found in S. T., section 45. These coefficients were defined by Poincaré in the fourth part of his Memoir already cited.



If the inequality $\sigma_\alpha \ge \theta$ did not hold, the rotation coefficient $\sigma_\alpha$ of $\Sigma_\alpha$ would have to be less than $\theta$. We could then find two integers $k$, $l$ ($k > 0$) such that

$$k\theta > 2l\pi > k\sigma_\alpha.$$

Let us then consider the composite transformation $T^*$ obtained in the following manner: we first perform the transformation $T^k$, with rotation coefficient $k\sigma_\alpha$ relative to $\Sigma_\alpha^{(k)}$; then we perform an ordinary rotation of the plane through an angle $-2l\pi$. It is evident from what precedes that the transformation $T^*$ so defined will have a rotation coefficient $k\sigma_\alpha - 2l\pi = -\varepsilon < 0$ on the corresponding set $\Sigma_\alpha^{(k)}$, while along $r = 0$ in the plane of the rectangular coordinates $r$, $\varphi$, the corresponding rotation will be

$$k\theta - 2l\pi = \delta > 0.$$

Let $Q$ be a point of $\Sigma_\alpha^{(k)}$ which lies on the circle $r = r_0$. As we have already seen, there will then exist a closed connected subset $\Sigma_\alpha^{*(k)}$ of $\Sigma_\alpha^{(k)}$ which contains $Q$ and which extends indefinitely to the right while approaching $r = 0$ (not necessarily uniformly). By the transformation $T^*$ this point $Q$ will be transformed into a point $T^*(Q)$ further to the left than the point $T^*(R)$, where $R$ denotes the point on $r = 0$ directly below $Q$ (see the adjoining figure), and $T^*(R)$ is itself to the left of $R$. One must remember here the fact that the transformation $T$, and hence $T^k$ ($k > 0$), turn every radial direction to the left; and also the fact that $\Sigma_\alpha^{*(k)}$ and $T^*(\Sigma_\alpha^{*(k)})$ are turned to the left of the radial direction. Consequently, if the set $T^*(\Sigma_\alpha^{*(k)})$ extended above $\Sigma_\alpha^{*(k)}$ to the right without cutting $QR$, the finite area enclosed by the axis of $\varphi$, $QR$ and $\Sigma_\alpha^{*(k)}$ would lie wholly inside the area of its image, which is impossible.

On the other hand, if $T^*(\Sigma_\alpha^{*(k)})$ extended above $\Sigma_\alpha^{*(k)}$ to the right while cutting $QR$ at least once, we reason in the following manner. If $T^*(\Sigma_\alpha^{*(k)})$ extends only a finite distance below $\Sigma_\alpha^{*(k)}$ to the right, there will exist a vertical line $SP$, as far to the right as possible, which extends from the axis of $\varphi$ to a first point of $T^*(\Sigma_\alpha^{*(k)})$. This line $SP$ can have no interior point in common with $\Sigma_\alpha^{*(k)}$, as the same figure shows. Let us now perform the transformation $T^{*-1}$. The point $T^{*-1}(S)$ will lie to the right of $S$, and the image $T^{*-1}(SP)$ will extend to the right of $T^{*-1}(S)$ as far as the set $\Sigma_\alpha^{*(k)}$ without crossing it. We see therefore that in this case the area enclosed by the axis of $\varphi$, $SP$ and $T^*(\Sigma_\alpha^{*(k)})$ would be transformed into a part of itself by $T^{*-1}$, which is also impossible. But if $T^*(\Sigma_\alpha^{*(k)})$ extends indefinitely to the right below $\Sigma_\alpha^{*(k)}$, it would form the boundary of a kind of loop which would contain $\Sigma_\alpha^{*(k)}$ in its interior, and this loop would be transformed into a part of itself by $T^{*-1}$, in an analogous manner.

We conclude therefore that $T^*(Q)$ must correspond to a point or an interval of the circle $\Gamma$ which does not precede $Q$ cyclically, and consequently that $k\sigma_\alpha - 2l\pi \ge 0$, which is impossible. Hence $\sigma_\alpha$ cannot be less than $\theta$.

In an entirely analogous manner, the consideration of the transformation $T^{-1}$ instead of $T$ shows us that $-\sigma_\omega$ must be greater than or equal to $-\theta$; indeed $T^{-1}$ turns every radial direction to the right. It follows that $\sigma_\omega$ is also at most equal to $\theta$.

But $\sigma_\alpha$ does not depend on the region $r < r_0$ that we choose. We can therefore replace $r_0$ by a very small value $\rho$ without modifying $\sigma_\alpha$. If the stated result does not hold, $\sigma_\alpha$ will exceed $\theta$, and there will exist integers $k$, $l$ ($k > 0$) such that

$$k\theta < 2l\pi < k\sigma_\alpha.$$

Hence we shall have $k\sigma_\alpha - 2l\pi > 0$. On considering the transformation $T^*$ as before (see the adjoining figure), it is evident, since $T^*(Q)$ follows $Q$ in cyclic order, that $T^*(QR)$ lies to the right of $QR$ and below $\Sigma_\alpha^{*(k)}$, and that $T^*(\Sigma_\alpha^{*(k)})$ extends indefinitely to the right below $\Sigma_\alpha^{*(k)}$. But the transformation $T^*$ is conservative, and consequently the area enclosed by the axis of $\varphi$, $QR$ and $\Sigma_\alpha^{*(k)}$ cannot be transformed into a part of this area. The preceding figure shows us therefore that $T^*(\Sigma_\alpha^{*(k)})$ must leave this region along the line $QR$.

Let us now recall the fact that the choice of $k$ and $l$ does not depend on the choice of $\rho$. On taking $\rho$ very small, the transformation $T^*$ will transform each point of $QR$ into a point which is at a distance of approximately $2l\pi - k\theta$ further to the right. It follows that for $\rho$ small enough the relation indicated in the figure cannot hold.

Hence we must have $\sigma_\alpha = \theta$. If the analogous hypothesis is satisfied by the sets $\Sigma_\omega$, we conclude $\sigma_\omega = \theta$ in the same manner.


4. The sets $\Sigma_\alpha$ and $\Sigma_\omega$ with branches and without branches (1).


There is a very interesting bifurcation of cases. It may happen that the fixed point itself must be counted among the accessible points of $\Sigma_\alpha$. That is to say, in the plane of the coordinates $u$, $v$ one can find a Jordan arc which extends from the exterior region to the fixed point without meeting $\Sigma_\alpha$ except at that point. We shall say in this case that the fixed point is "with α branches"; in every other case we shall say that this point is "without α branches". These definitions extend immediately to points "with ω branches" and "without ω branches".

(1) I obtained some special facts on the case with branches (starting from a different definition) in S. T.

Suppose that $O$ is a fixed point with α branches. We could then find such a Jordan arc extending from a point $A$ of the given circle to the fixed point. On considering $r$, $\varphi$ as rectangular coordinates and recalling the structure of $\Sigma_\alpha$, we see that this Jordan arc must extend indefinitely to the right while approaching the axis of $\varphi$ uniformly. In the adjoining figure two contiguous representations of the Jordan arc $AJ$ are indicated, as well as a part $\Sigma_\alpha^*$ of $\Sigma_\alpha$ of the type considered below. Consequently, the same figure shows us that $\Sigma_\alpha$, and every subset of $\Sigma_\alpha$ which is closed and connected to the fixed point, must possess the same property of approaching the axis of $\varphi$ uniformly toward the right. On returning to the plane of the coordinates $u$, $v$ we conclude the following facts.

In the unstable non-degenerate elliptic case, if the fixed point is a point with α branches, every set $\Sigma_\alpha$ closed and connected to the fixed point will have the property that $r$ approaches zero uniformly as $\varphi$ tends to $-\infty$, while $\varphi$ is bounded above (below) everywhere.

An analogous result holds for a fixed point with ω branches.

On the circle $\Gamma$ which we employed to represent the accessible points of $\Sigma_\alpha$ in the last section, the fixed point appears once for each exterior neighborhood of $\Sigma_\alpha$. But the image under $T$ of one of these neighborhoods of the fixed point is also such a neighborhood. We perceive therefore that there is a set of representative points on $\Gamma$ which is transformed into itself by $T$ or $T^{-1}$. Consequently we shall have $\sigma_\alpha = \theta$.

In the same case, if the fixed point is a point with α (ω) branches, the corresponding rotation coefficient of the set $\Sigma_\alpha$ ($\Sigma_\omega$) is always $\theta$; hence for $\theta \ne 0$ there are an infinite number of different neighborhoods of the fixed point $O$ by which this point is accessible (1).

An analogous result holds for a fixed point with ω branches.

Evidently these accessible neighborhoods of the fixed point are ordered cyclically around $O$. If $V_1$ and $V_2$ are any two among them, of which $V_1$ precedes $V_2$ in the angular sense, one can construct a Jordan arc $V_1V_2$ which joins them and which does not meet $\Sigma_\alpha$; it might happen that this arc must leave the circle $r = \rho_0$. The part of $\Sigma_\alpha$ enclosed by the arc $V_1V_2$ will have a certain maximum distance from $r = 0$; let us denote it by $d(V_1, V_2)$. It is clear that $d(V_1, V_2)$ does not depend on the arc $V_1V_2$ chosen. Furthermore we see that if $V_1$, $V_2$, $V_3$ are in increasing cyclic order, $d(V_1, V_3)$ is always equal to the greater of $d(V_1, V_2)$ and $d(V_2, V_3)$.

Now let $W$ be any accessible point (2) of $\Sigma_\alpha$ other than the fixed point $O$. Only three cases are possible: (1) $W$ lies in an interval $[W]$ terminated by two adjacent points $V_1$ and $V_2$ corresponding to two adjacent accessible neighborhoods of the fixed point; (2) the interval $[W]$ is terminated by one of the points $V$ on one side only; (3) $[W]$ is not terminated by a point $V$.

(1) The number $\theta \ne 0$ is not commensurable with $2\pi$.

(2) More exactly, one of its neighborhoods. It is advantageous here to employ the representation on the circle $\Gamma$.

In the first case there will exist between $V_1$ and $V_2$ a single "α branch" of $\Sigma_\alpha$ (by definition) which extends precisely to the distance $d(V_1, V_2)$ from $r = 0$. In the second case, if $V$ is the point on the one side, and $V_1, V_2, V_3, \ldots$ a sequence of points which approach $W$ from the other side, there will exist a lower limit of $d(V_n, V)$ for $\lim n = \infty$. This limit cannot reduce to zero, since the arc $V_nV$ must enclose the point $W$ above $r = 0$. The corresponding α branch here is the closed connected subset of $\Sigma_\alpha$ enclosed by all the arcs $V_nV$. Evidently this branch is completely defined in this manner and belongs to the interval $[W]$. In the third case, on considering the Jordan arcs $V_nV_n'$, where $V_n$ and $V_n'$ are points on the two sides of $W$ respectively, we obtain a corresponding limiting α branch $[W]$. Hence the following conclusions:

In the same case, if the fixed point is a point with α branches, every accessible point of $\Sigma_\alpha$ other than the fixed point $O$ lies on a single α branch of $\Sigma_\alpha$, which is closed, connected to $O$, and such that the radial distance of its points tends to zero as the angle $\varphi$ tends to $-\infty$. These branches are infinite in number for $\theta \ne 0$ and are ordered cyclically around $O$. They are transformed among themselves by $T$, in such a way that the rotation coefficient under $T$ is the number $\theta$ of the normal form (6). Between any two of these α branches one can insert Jordan arcs which are exterior to $\Sigma_\alpha$ and which extend to the fixed point.

An analogous result holds for a fixed point with ω branches.

There is another method of definition which allows us to obtain analogous results for points without α branches. Let us take a point $W$ ($\ne O$) of the set $\Sigma_\alpha$ and construct a small circle of radius $\rho$ about the fixed point $O$. Each such point $W$ outside this circle will define a subset $\Sigma_\alpha(\rho, W)$ of $\Sigma_\alpha$ composed of the points of $\Sigma_\alpha$ connected to $W$ outside this circle. The set $\Sigma_\alpha(\rho, W)$ is closed and connected, and contains the point $W$ and at least one point on the circumference of the circle of radius $\rho$. As $\rho$ tends to zero, the set $\Sigma_\alpha(\rho, W)$ expands and tends to a limit set $\Sigma_\alpha(W)$, which is connected to $O$ and contains $W$. Evidently, if $\Sigma_\alpha(W)$ and $\Sigma_\alpha(W')$ have a single point in common, they must coincide completely. It is evident also that these sets are transformed among themselves by $T$ and $T^{-1}$.

If the fixed point is a point with α branches, each branch must consist of a collection of the sets $\Sigma_\alpha(W)$.




Unfortunately nothing allows us to assert in the general case that these sets are closed, nor that they are arranged in a definite cyclic order.

In the same case, if the fixed point $O$ is without α (ω) branches, one can attach to every point $W \ne O$ a kind of branch $\Sigma_\alpha(W)$ ($\Sigma_\omega(W)$) which consists of all the points of $\Sigma_\alpha$ ($\Sigma_\omega$) connected to $W$ without passing through the fixed point. Every point of $\Sigma_\alpha$ ($\Sigma_\omega$) belongs to a single set of this kind, and these sets are transformed among themselves by $T$ and $T^{-1}$. But there is no reason to believe that these sets are closed except at the fixed point, nor that they are arranged in a natural cyclic order around this point.


5. The interrelation of the sets $\Sigma_\alpha$ and $\Sigma_\omega$.

The following results are now evident:

In every neighborhood of a fixed point of general elliptic type the sets $\Sigma_\alpha$ and $\Sigma_\omega$ interlace in such a way that they have an infinite number of points in common, corresponding to the solutions homoclinic to the given ei-periodic solution. These sets $\Sigma_\alpha$ and $\Sigma_\omega$ completely surround the fixed point in small neighborhoods of this point (1).

If the fixed point is a point with α branches as well as with ω branches, one can make points $P$ and $Q$, everywhere dense on two circles $\Gamma_\alpha$ and $\Gamma_\omega$, correspond to these branches, in such a way that the two families of branches are ordered cyclically like the corresponding points $P$ and $Q$, and divide the plane into a kind of network. The transformation $T$ will transform this network into itself, advancing the corresponding points $P$ and $Q$ in the angular sense with the rotation coefficient $\theta$ (see the adjoining figure).



(1) See S. T., Chap. 2.




6. The neighboring periodic solutions.

The normal form (6) shows us why the conservative transformation $T$ associated with a non-degenerate e-periodic solution of general type is approximately a rotation through an angle $\theta + c r^{2m}$ ($c \ne 0$) about the fixed point of $T$ which corresponds to the periodic solution. More precisely, in suitable coordinates $u$, $v$, we can write for $T^n$

$$r_n = r + R_n(r, \varphi), \qquad \varphi_n = \varphi + n\theta + n c r^{2m} + \Phi_n(r, \varphi), \qquad (0 < \theta < 2\pi), \tag{7}$$

where $|R_n|$, $|\Phi_n|$ and the absolute values of their first partial derivatives with respect to $r$ and $\varphi$ are less than $L r^{\mu}$, with $\mu$ arbitrarily large, provided only that


(8) I » I < L r* 

Consequently, for suitable values of the integer $k$, the quantity

$$\varphi_n - \varphi - 2k\pi$$

will increase with $r$ from a negative value to a positive value; we shall suppose also that $k$ and $n$ have no common factor, in order to simplify the statement of the results. It is evident that there exists a curve $D$ on which the equality $\varphi_n = \varphi + 2k\pi$ holds, and that $D$ is cut only once by each ray issuing from the origin. But the area enclosed by $D$ must be equal to that enclosed by the transformed curve $T^n(D)$, and both curves enclose the fixed point. It follows that the two curves will have at least two points in common, $P$, $Q$. Since all the points of $D$ are transformed radially by $T^n$ into the corresponding points of $T^n(D)$, $P$ and $Q$ must be fixed points under the transformation $T^n$. But in this case $P, T(P), \ldots, T^{n-1}(P)$ and $Q, T(Q), \ldots, T^{n-1}(Q)$ will also be fixed points under $T^n$.

There exist therefore an infinite number of periodic solutions in the immediate neighborhood of every non-degenerate ei- or es-periodic solution of general type.
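The curve on which $T^n$ advances the angle by exactly $2k\pi$ can be located explicitly in a simplified model. The sketch below assumes the integrable map obtained by dropping the correction terms $R_n$ and $\Phi_n$ from (7), so that each step is a rotation through $\theta + c r^{2m}$. The function names and the numerical values of `theta`, `c`, `m`, `n`, `k` are illustrative assumptions. In this model every point of the resonant circle is fixed under $T^n$, so the sketch does not exhibit the case of finitely many fixed points.

```python
import math


def twist_step(r, phi, theta, c, m):
    """One step of the model map: r is unchanged, phi advances by theta + c*r**(2m)."""
    return r, phi + theta + c * r ** (2 * m)


def resonant_radius(n, k, theta, c, m):
    """Radius at which n steps advance the (unwrapped) angle by exactly 2*k*pi.

    Solves n*theta + n*c*r**(2m) = 2*k*pi for r.
    """
    value = (2 * k * math.pi - n * theta) / (n * c)
    if value <= 0:
        raise ValueError("no resonant circle for this pair (n, k)")
    return value ** (1.0 / (2 * m))


def iterate(r, phi, n, theta, c, m):
    """Apply the model map n times."""
    for _ in range(n):
        r, phi = twist_step(r, phi, theta, c, m)
    return r, phi


theta, c, m = 1.0, 1.0, 1   # illustrative constants, not from the text
n, k = 6, 1                 # n and k have no common factor
r_star = resonant_radius(n, k, theta, c, m)
phi_start = 0.3
r_end, phi_end = iterate(r_star, phi_start, n, theta, c, m)
advance = phi_end - phi_start   # equals 2*k*pi up to rounding
```

Inside the resonant circle the advance of $T^n$ is less than $2k\pi$ and outside it is greater (for $c > 0$), which is the sign change of $\varphi_n - \varphi - 2k\pi$ used in the argument.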




We have employed here the following entirely evident lemma:

Lemma. If, in polar coordinates $r$, $\varphi$, a conservative transformation $T$: $r_1 = f(r, \varphi)$, $\varphi_1 = g(r, \varphi)$, with the origin as fixed point, is such that $\varphi_1 - \varphi$ increases with $r$ from a negative value at $r = 0$ to a positive value for $r = r_0 > 0$, there will exist at least two fixed points of $T$ for $r < r_0$.

Here is a Lemma which plays the role of Poincaré's last Geometric Theorem in the immediate neighborhood of an e-periodic solution (1). We mention it because we shall see in the following Chapter that there is a corresponding Lemma which applies to the "extended neighborhood" of an h-periodic solution.

(1) I regard this Lemma, simple as it is, as the primitive form of the general theorems of this kind.

Let us now push the discussion of the neighboring periodic solutions a little further. The preceding inequalities show us that, for suitably chosen integers $k$ and $n$, the transformation $T^n$ turns the points of such a curve $D_{n,k}$ through exactly the angle $2k\pi$ about the fixed point, where $D_{n,k}$ is cut only once by each ray and is given asymptotically, in position and in direction of its tangent, by the equation

$$r^{2m} = \frac{2k\pi - n\theta}{nc}.$$

It is evident that all the points $P$ such that $T^n(P) = P$, for which the corresponding angle of rotation is $2k\pi/n$, lie on $D_{n,k}$. In the adjoining figure we represent this curve in the plane of the polar coordinates $r$, $\varphi$.

Fig. 13.

Starting from a point $P$ of this kind we find also $n - 1$ other points $T(P), T^2(P), \ldots, T^{n-1}(P)$ on the same curve $D_{n,k}$. In fact the relation $T^{n_1}(P) = P$ ($n_1 < n$) gives us immediately $T^d(P) = P$, where $d$ is the common factor of $n$ and $n_1$. Consequently, if $T^n$ turns $P$ through the angle $2k\pi$, the transformation $T^d$ turns it through $2k\pi/\lambda$, where $n = \lambda d$. But in this case $k$ and $n$ would possess a common factor, which would contradict our initial hypothesis.

Let us set aside the simplest case of all, that in which the curve $T^n(D_{n,k})$ coincides with $D_{n,k}$. In this case each point of $D_{n,k}$ must be an invariant point under $T^n$.

In every other case there will exist a finite number $nl$ of fixed points, which we write in a double-entry table:

$$\begin{array}{llll}
P_1, & T(P_1), & \ldots, & T^{n-1}(P_1), \\
P_2, & T(P_2), & \ldots, & T^{n-1}(P_2), \\
\vdots & & & \vdots \\
P_l, & T(P_l), & \ldots, & T^{n-1}(P_l).
\end{array}$$

The transformation $T$ turns each of these $nl$ points approximately through the angle $2k\pi/n$.

As $\varphi$ increases from 0 to $2\pi$, one must find on $D_{n,k}$ successively $P_1, \ldots, P_l$, that is to say one of the representatives of each row. The point $T^p(P_1)$ of the first series must follow the point $P_1$ of that series, where $pk \equiv 1 \pmod{n}$, and so on.

At each of these points the curve $D_{n,k}$ and its image $T^n(D_{n,k})$ must intersect, and they do not intersect otherwise (see the figure above).

The asymptotic formulas give no information about the value of $l$, which may in fact be infinite, as we have already said.

To go further, let us recall the definition of the "index" of any periodic solution for such a point $P$. When $Q$ describes a small curve around $P$ in the positive sense, the angle of the vector $Q \to T^n(Q)$ will describe an angle $2i\pi$, where $i$ is the index in question. For an e-periodic solution of general type the index must be $+1$; for an h-periodic solution of general type for which the roots $\rho$, $1/\rho$ are not negative, the index must be $-1$ (see Chapter III). We can rule out the possibility $\rho < 0$ here; indeed the asymptotic form in the neighborhood of the invariant point $(r_0, \varphi_0)$ gives us asymptotically

$\Delta r_n = \Delta r + \cdots$
$\Delta\varphi_n = \Delta\varphi + 2mnc\,r_0^{2m-1}\,\Delta r + \cdots$

where $\Delta r$ and $\Delta\varphi$ denote the coordinates of $(r, \varphi)$ relative to $(r_0, \varphi_0)$. Consequently, in the present case $\rho$ and $1/\rho$ are near unity, and cannot be negative.
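The definition of the index just recalled can be tried out numerically. The following sketch is only an illustration under stated assumptions: the three linear maps are hypothetical stand-ins for $T^n$ near a fixed point at the origin, and the function name, radius and step count are arbitrary choices.

```python
import math

def index_of_fixed_point(T, center=(0.0, 0.0), radius=1e-2, steps=2000):
    """Number of turns made by the vector Q -> T(Q) while Q goes once,
    in the positive sense, around a small circle about the fixed point."""
    total = 0.0
    prev = None
    for i in range(steps + 1):
        t = 2 * math.pi * i / steps
        q = (center[0] + radius * math.cos(t), center[1] + radius * math.sin(t))
        tq = T(q)
        ang = math.atan2(tq[1] - q[1], tq[0] - q[0])
        if prev is not None:
            # Wrap the angle increment into [-pi, pi) before accumulating.
            total += (ang - prev + math.pi) % (2 * math.pi) - math.pi
        prev = ang
    return round(total / (2 * math.pi))

theta = 0.3  # hypothetical rotation angle
elliptic = lambda q: (math.cos(theta) * q[0] - math.sin(theta) * q[1],
                      math.sin(theta) * q[0] + math.cos(theta) * q[1])
hyperbolic_positive = lambda q: (2.0 * q[0], 0.5 * q[1])    # roots 2 and 1/2
hyperbolic_negative = lambda q: (-2.0 * q[0], -0.5 * q[1])  # roots -2 and -1/2
```

For these model maps the rotation has index $+1$ and the hyperbolic map with positive roots has index $-1$, while the map with negative roots has index $+1$, which is why the case $\rho < 0$ has to be excluded before indices are used to tell the two types apart.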

It follows immediately that the total sum of the indices of the preceding stable and unstable periodic motions reduces to $0$. Indeed, let the point $Q$ describe a rectangle $\alpha\beta\gamma\delta$ of width $2\pi$ of the type indicated in the figure above, which encloses all the points $P_1, \dots, P_l$. Consider the angle which the direction $Q \to T^n(Q)$ makes with the $\varphi$-axis. On the upper side $\gamma\delta$ this vector is directed toward the left with an almost horizontal direction; on the lower side $\alpha\beta$ it is directed toward the right; on the opposite sides $\beta\gamma$, $\delta\alpha$ the rotations of this vector are equal but of opposite signs. Hence the total rotation is small in absolute value and must be zero.

Consider in the same way a rectangle $\alpha'\beta'\gamma'\delta'$ of a width small enough that it contains only a single point $P$. When the moving point $Q$ traverses $\alpha'\beta'$ and $\gamma'\delta'$, the rotation in each case is small, as in the analogous case treated above. For each of the two opposite sides $\beta'\gamma'$ and $\delta'\alpha'$, the images $T^n(\beta'\gamma')$ and $T^n(\delta'\alpha')$ are cut only once by a ray $r = \text{const.}$; here we rely on the fact that $\varphi_n - \varphi$ increases with $r$.

It follows that the two rotations which correspond to $\beta'\gamma'$ and $\delta'\alpha'$ are near $\pi$ or $-\pi$, according as the image cuts the corresponding ray above the point of $D_{n,k}$ or below this point. Hence the total rotation can only be $2\pi$ or $0$ or $-2\pi$, and the corresponding index is $+1$ or $0$ or $-1$. Moreover, starting from a rectangle $\alpha'\beta'\gamma'\delta'$ which includes two or more adjacent fixed points, one sees that the indices $+1$ and $-1$ must succeed one another, fixed points of index $0$ being disregarded.

Let us now remark that in the case of a simple fixed point the index must be $+1$ or $-1$ according as the point is elliptic or hyperbolic. In fact, the index of every elliptic fixed point is $+1$, as is made plausible by the fact that the nearly invariant curves are approximately closed; and in the same way the index of every hyperbolic fixed point is precisely $1 - k$, if $k$ asymptotic branches pass through this point (1). Consequently the indices $+1$, $-1$ will succeed one another alternately, points of index $0$ being disregarded; the fixed points of index $-1$ are simple and will have two asymptotic branches; those of index $0$ will be hyperbolic and will have a single branch. The latter may be regarded as double points formed by the union of a simple hyperbolic point and a simple elliptic point.

(1) These facts are immediate consequences of the results of the present chapter and of the following chapter.

With this convention we obtain the following result:

Hence for $k$ and $n$ so chosen with respect to a non-degenerate elliptic periodic solution of general type, there exist at least $l \ge 2$ corresponding periodic solutions which cut the local surface of section in the $nl$ distinct points

$P_i,\ T(P_i),\ \dots,\ T^{n-1}(P_i) \qquad (i = 1, \dots, l)$

of the curve $D_{n,k}$. These $nl$ points are arranged on $D_{n,k}$ in the same cyclic order as the corresponding angles

$\varphi_i,\ \varphi_i + \dfrac{2k\pi}{n},\ \dots,\ \varphi_i + \dfrac{2k(n-1)\pi}{n} \pmod{2\pi}.$

All these fixed points of $T^n$ are simple and alternately hyperbolic and elliptic, with the exception of double fixed points with a single asymptotic branch (1).

We thus see that there exist fairly regular groups of e- and h-periodic solutions in the immediate neighborhood of every non-degenerate e-periodic solution of general type. These play the role of inevitable satellites of the given periodic solution.


7. Extension to the other e-periodic solutions.


We shall now show that the results thus obtained admit extensions to every elliptic case other than the degenerate case indicated above, in which a power of $T$ reduces formally to the identity.


Indeed, all the reasoning employed in the general case depends on the use of polar variables $r$, $\varphi$ with the following two properties: (1) the transformation $T$ is expressed in the form

$r_1 = f(r, \varphi), \qquad \varphi_1 = \varphi + g(r, \varphi),$

where $f$ and $g$ are continuous for $r = 0$, analytic for $r > 0$, periodic of period $2\pi$ in $\varphi$, with $f(0, \varphi) = 0$, $g(0, \varphi) = \text{const.}$, and such that the images of the radial lines are all turned by $T$ toward the left of the radial direction; (2) the transformation $T^n$, for $n$ sufficiently large, transforms every radial segment $\varphi = \text{const.}$, $0 \le r \le d$, into a curve which extends (in the angular sense) indefinitely toward the left of the point where it meets $r = 0$.

(1) Strictly speaking, the case in which $D_{n,k}$ is an analytic branch of invariant points must be included also.

These are the only two properties which are essential in the preceding qualitative analysis.

In every case, let $F(u, v)$ be a polynomial in $u$, $v$ whose expansion in $u$ and $v$ coincides with that of the invariant series $F^*(u, v)$ up to terms of sufficiently high degree, where the coordinates $u$, $v$ are chosen so that the function $Q(u, v)$ of the integral invariant $\iint Q\,du\,dv$ reduces identically to $1$. Then quasi-radial coordinates $r^*$, $\varphi^*$ enjoying these two properties are furnished respectively by the constant $C = r^*$ attached to the closed curve $F = C$ passing through the given point, and by the ratio $2\pi\kappa/k(C)$, where $\kappa$ denotes the value of the integral below, taken from the point where the $u$-axis cuts this curve up to the point $(u, v)$ considered, while $k(C)$ denotes the complete value of this integral:

$\displaystyle\int \frac{du}{F_v} = -\int \frac{dv}{F_u}.$


Let us remark that in the general case these coordinates are essentially the polar coordinates. Indeed we may take $F = u^2 + v^2$ in this case, which gives

$r^* = r^2, \qquad \kappa = -\varphi/2.$


Let us begin with the simplest case, in which $pr - 4q^2 = 0$ but $p^2 + q^2 + r^2 \ne 0$ (see equation (4)). In this case, by a linear transformation of the variables, we may write

(7) $\qquad F^* = pu^2 + qu^3 + ru^2v + suv^2 + tv^3 + \cdots \qquad (p \ne 0).$


To find the formal factors of $F^*$ one must consider the corresponding Newton polygon (see the adjoining figure, Fig. 14), in which the point $(2, 0)$ must be marked. If the point $(0, 3)$ also enters ($t \ne 0$), we should necessarily have a real invariant curve of the asymptotic form $v = (-pu^2/t)^{1/3} + \cdots$, that is to say a hyperbolic case. Hence $(0, 3)$ must not be a marked point ($t = 0$).

We can also exclude the case of two sides of our polygon, since each of these sides, which would have to join $(2, 0)$ to $(1, \alpha)$ and $(1, \alpha)$ to $(0, \beta)$, would give a real invariant curve. Hence there is only a single side.

If now there are only two marked points on this side, the two points $(2, 0)$, $(0, 2m)$ define it, since the possibility of a second point $(0, 2m + 1)$ gives us a real invariant curve. The two corresponding coefficients of $F$ must be of the same sign, and in this case we may write

(8) $\qquad F^* = pu^2 + qv^{2m} + \cdots \qquad (pq > 0),$

where the other terms do not effectively enter. Let us consider first this case, which is the simplest possible.

Let $F$ be a polynomial in $u$ and $v$ whose series expansion is identical with that of $F^*$ up to terms of very high degree $\mu$. Let us define a conservative transformation $\bar{T}$ by means of the solution $(u, v)$ of the differential equations

(9) $\qquad \dfrac{du}{dk} = \dfrac{\partial F}{\partial v}, \qquad \dfrac{dv}{dk} = -\dfrac{\partial F}{\partial u},$

for which $u = u_0$, $v = v_0$ for $k = 0$; the series for $u_1$, $v_1$ defining $\bar{T}$ are obtained by setting $k = 1$, and will be identical with the series defining $T$ up to terms of degree $\mu$ at least. Hence the transformation $\bar{T}$ imitates $T$ very closely; more precisely, the distance from the point $T(P)$ to the point $\bar{T}(P)$ does not exceed $Kr^{\mu}$, $r$ being the radial distance from $P$ to the origin. The curves $F = C$ are individually invariant under the transformation $\bar{T}$.

Let us now denote by $k(C)$ the number of iterations of $\bar{T}$ which are necessary for the point $(u, v)$ to make a complete circuit of a curve $F = C$.

The differential equations (9) give us

(10) $\qquad k(C) = \displaystyle\int \frac{du}{F_v} = -\int \frac{dv}{F_u} = \int \frac{ds}{\sqrt{F_u^2 + F_v^2}},$

where $ds = \sqrt{du^2 + dv^2}$. On the other hand, if $A(C)$ denotes the area enclosed by the closed curve $F = C$, we evidently have

$dA(C) = \displaystyle\int dn\, ds,$

where $dn$ denotes the normal distance from the curve $F = C$ to the curve $F = C + dC$, which is $dC/\sqrt{F_u^2 + F_v^2}$. In this way we obtain the following relation:

(11) $\qquad k(C) = \dfrac{dA(C)}{dC}.$
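The relation between the circuit time $k(C)$ and the derivative of the enclosed area can be checked numerically on a sample curve family. The sketch below is only an illustration under stated assumptions: it takes the hypothetical choice $F = u^2 + v^{2m}$ (that is, $p = q = 1$), integrates the differential equations (9) with a classical Runge-Kutta step, and compares the circuit time with a finite-difference derivative of the area; the function names, step sizes and tolerances are not from the text.

```python
import math

def flow_period(C, m=2, h=1e-3):
    """Time k(C) needed for one circuit of F = u^2 + v^(2m) = C under
    du/dk = F_v, dv/dk = -F_u, found by RK4 integration from (sqrt(C), 0)."""
    def rhs(u, v):
        return 2 * m * v ** (2 * m - 1), -2.0 * u
    u, v, t = math.sqrt(C), 0.0, 0.0
    while True:
        k1 = rhs(u, v)
        k2 = rhs(u + 0.5 * h * k1[0], v + 0.5 * h * k1[1])
        k3 = rhs(u + 0.5 * h * k2[0], v + 0.5 * h * k2[1])
        k4 = rhs(u + h * k3[0], v + h * k3[1])
        u_new = u + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
        v_new = v + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
        # The circuit closes when v comes back down through 0; interpolate linearly.
        if v > 0.0 and v_new <= 0.0:
            return t + h * v / (v - v_new)
        u, v, t = u_new, v_new, t + h

def area(C, m=2, n=20000):
    """Area enclosed by u^2 + v^(2m) = C (midpoint rule with v = vmax*sin(theta))."""
    vmax = C ** (1.0 / (2 * m))
    step = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        th = (i + 0.5) * step
        vv = vmax * math.sin(th)
        total += math.sqrt(max(C - vv ** (2 * m), 0.0)) * vmax * math.cos(th)
    return 4 * total * step

def area_derivative(C, m=2):
    """Central finite difference for dA/dC."""
    d = 1e-4 * C
    return (area(C + d, m) - area(C - d, m)) / (2 * d)
```

With $m = 2$ the two quantities `flow_period(C)` and `area_derivative(C)` agree to within the discretization error for the values of $C$ tried, and the circuit time grows as $C$ decreases, in line with the behavior of $k(C)$ for small $C$ discussed in the text.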


After this general remark, let us return to the special case in which $F^*$, and hence $F$, are of the form (8). Let us modify the variables $u$ and $v$ in this case by writing $u = \bar{u}\,C^{1/2}$, $v = \bar{v}\,C^{1/2m}$ (1). The area $\bar{A}(C)$ enclosed by the corresponding new curve

$p\bar{u}^2 + q\bar{v}^{2m} + \cdots = 1$

is equal to $A(C)/C^{(m+1)/2m}$. We see also that $\bar{A}(C)$ is expressed by a series in powers of $C^{1/2m}$, and hence that

$A(C) = C^{(m+1)/2m}\,(a_0 + a_1 C^{1/2m} + \cdots) \qquad (a_0 > 0).$

Let us remark in fact that for $C = 0$ the curve in $\bar{u}$, $\bar{v}$ reduces to the oval $p\bar{u}^2 + q\bar{v}^{2m} = 1$, and that it varies analytically with $C^{1/2m}$.

Equation (11) now gives us

(12) $\qquad k(C) = C^{-(m-1)/2m}\,(a_0' + a_1' C^{1/2m} + \cdots),$

and we see therefore that the number $k(C)$ of iterations of $\bar{T}$ necessary for $P$ to make the complete circuit of $F = C$ increases steadily and tends to infinity like $C^{-(m-1)/2m}$ for $\lim C = 0$. Let us now consider the variables $r^*$, $\varphi^*$ indicated. In these variables the transformation $\bar{T}$ evidently takes the form

(13) $\qquad r_1^* = r^* = C, \qquad \varphi_1^* = \varphi^* + \dfrac{2\pi}{k(r^*)},$

and the angle of rotation of $\bar{T}$ decreases steadily along with the radial distance $r^*$, since, according to (12), $k(r^*)$ increases indefinitely when $r^*$ tends to $0$. In these same variables $r^*$, $\varphi^*$ the transformation $T$ also enjoys the properties stated, as is shown by a detailed and entirely direct examination (1).

(1) We suppose that $C > 0$. In the case where $C < 0$ along the closed curves, $C$ must be replaced by $-C$.

Let us pass now to the other case, in which the Newton polygon possesses a single side with three marked points $(2, 0)$, $(1, m)$, $(0, 2m)$. The form of $F^*$ is now the following:

$F^*(u, v) = pu^2 + quv^m + rv^{2m} + \cdots \qquad (pqr \ne 0),$

where the two characteristic roots of the equation $p\rho^2 + q\rho + r = 0$ [words illegible in the source] be negative in the elliptic case. If we suppose that these roots are not equal, the same reasoning still applies without modification.

There remains only the case of a real double root, $\rho = -q/2p$. But in this case, if one introduces new variables

$u = \bar{u} - \dfrac{q v^m}{2p}, \qquad v = \bar{v},$

one arrives at an expression for $F$ in which the terms following $p\bar{u}^2$ are of a degree greater than $m + 1$. One can now begin the same procedure again. In this way it appears that, in the only case which remains, one can raise indefinitely the lowest degree of the terms in $F^*$ after the term $p\bar{u}^2$. But this amounts to saying that $F^*$ admits a real curve as a double invariant curve, which is not possible in the elliptic case.

Consequently such coordinates $r^*$, $\varphi^*$ always exist if $F^*$ contains quadratic terms. We may therefore assume that $F^*$ begins with terms of degree at least four, since in the elliptic case the lowest degree of the terms which enter in $F^*$ must be even. We are thus led to consider the case in which

$F^*(u, v) = pu^{2m} + qu^{2m-1}v + \cdots,$

where the polynomial $F_{2m}(u, v)$ of degree $2m$ $(m > 1)$ with which the series begins gives us a characteristic equation $F_{2m}(\rho, 1) = 0$ all of whose real roots must be either negative or positive of even multiplicity. Except in the case where such multiple roots $\rho \ne 0$ enter, we can apply the same manner of reasoning in this case.

(1) Indeed the transformation $T$ consists of the transformation $\bar{T}$ followed by a transformation $U$, where $U$ is nearly the identity for large values of $\mu$. More precisely, $U$ is of the form

$u_1 = u + F_\mu(u, v), \qquad v_1 = v + G_\mu(u, v),$

where the series expansions of $F_\mu$ and $G_\mu$ begin with terms of degree $\mu$ at least in $u$ and $v$.

To avoid the difficulties which may enter in this exceptional case we reason in the following manner.

In the form of $F^*$ which we have just written we may suppose that the term in $v^{2m}$ effectively enters. To achieve this it is only necessary to make a suitable linear change of the variables. In this case the equation $F(u, v) = C$ defines $v$ as an algebraic function of $u$ and of $C$ on a Riemann surface of $2m$ sheets, which remains finite except at the point $u = \infty$. Moreover the coordinates $u_i$ of the branch points satisfy the two equations $F(u, v) = C$, $F_v(u, v) = 0$. This shows us that in the neighborhood of $C = 0$ these coordinates $u_i$ $(i = 1, 2, \dots, l)$ are expressed by series in powers of $C^{1/\lambda}$ $(\lambda \le 2m)$. For $C$ sufficiently small but not equal to zero, some of these different series give us distinct points which for $C = 0$ coincide at the origin. It follows that this Riemann surface has, for $|u| \le \varepsilon$, a certain connectivity $q$ ($|C|$ being small).

Evidently in these circumstances the formulas (10) show us that $k(C)$, for each real value of $C$, is merely an algebraic integral taken along a closed curve in this Riemann surface. This integral will be analytic in $C$ except perhaps at the origin itself. Hence, writing $C = e^{2\pi\sqrt{-1}\,D}$, the function $k^*(D) = k(e^{2\pi\sqrt{-1}\,D})$ will be analytic in $D$, at least if the real part of $\sqrt{-1}\,D$ is negative and sufficiently large.

More generally, let $I_1^*(D), I_2^*(D), \dots, I_r^*(D)$ be a fundamental system of such integrals for $|u| \le \varepsilon$. We shall have

$k^*(D) = i_1 I_1^*(D) + \cdots + i_r I_r^*(D),$

where $i_1, \dots, i_r$ are integers. We are going to study the general nature of these functions $I_1^*(D), \dots, I_r^*(D)$. It is evident that $I_1^*(D), \dots, I_r^*(D)$ are analytic for the real part of $\sqrt{-1}\,D$ negative and sufficiently large. If we construct a cut along the positive axis in the plane of the complex variable $C$, that is to say, if $D$ remains to the left of the axis of imaginaries in the plane of $D$, these functions, considered as functions of $C$, will have determinate values.

The $l$ points $u_1, \dots, u_l$, which vary algebraically with $C$, will have well-determined orders in $C$, and will separate into groups having the same order. Those of the same order will tend toward limiting geometric configurations, whose orientations relative to the origin change with the argument of $C$.

Let us now construct about the points $u_i$ small circles with radii $|C|^f$. If the order $f$ is sufficiently large, these circles will not contain other points $u_j$, and their distances from such points will be of smaller orders. Let us now consider the minimum of $|F_v|$ for $|u| \le \varepsilon$ outside these circles, for a given value of $C \ne 0$. Since $F_v$ is analytic in this region it will attain its positive minimum on the boundary. But for $|u| = \varepsilon$ this minimum cannot be small, since $F_v$ does not vanish except at the branch points. Hence the minimum value will occur on one of these small circles.

If we write $u = x + \sqrt{-1}\,y$, $u_i = x_i + \sqrt{-1}\,y_i$ and $G = |F_v|^2$, the two equations

$(x - x_i)^2 + (y - y_i)^2 = |C|^{2f}, \qquad (x - x_i)\dfrac{\partial G}{\partial y} - (y - y_i)\dfrac{\partial G}{\partial x} = 0$

will hold at such a minimum for at least one value of the integer $i$. But $x_i$, $y_i$ are real algebraic functions of $C_1$ and $C_2$, where $C = C_1 + \sqrt{-1}\,C_2$. Hence such a minimum point is necessarily found among a certain number of points $x$, $y$ whose coordinates are algebraic functions of $C_1$ and $C_2$. By an analogous reasoning the minimum of $|F_v|^2$ on a given circle $C_1^2 + C_2^2 = |C|^2$ in the $C_1$, $C_2$ plane will be attained on one of certain algebraic curves passing through the origin. It follows that the minimum of $|F_v|$ will exceed $K|C|^d$ for $d$ sufficiently large.

But the length of the closed curves of integration for $I_1, \dots, I_r$ (taken outside the circles) is evidently uniformly bounded if $C$ does not cross the real axis in the plane of $C$. We conclude therefore that $I_1, \dots, I_r$ are less than $L|C|^{-d}$ if the constants $d$ and $L$ are sufficiently large.

Let us now suppose that the complex variable $C$ makes a complete circuit of the origin. Each integral $I_i$ is changed into an analogous integral $\bar{I}_i$ which is expressed linearly in $I_1, \dots, I_r$:

$\bar{I}_i = \displaystyle\sum_{j=1}^{r} \gamma_{ij} I_j \qquad (|\gamma_{ij}| \ne 0).$

Here the $\gamma_{ij}$ are integers which do not depend on $C$. Introducing the variable $D$, these equations become

$I_i^*(D + 1) = \displaystyle\sum_{j=1}^{r} \gamma_{ij} I_j^*(D).$

Hence the $I_i^*(D)$ satisfy a system of linear finite-difference equations with constant coefficients. The general solution can be written linearly in terms of $r$ particular solutions, which in general are of the form $e^{c_j D}$ but which may be of the form $D^{k_j} e^{c_j D}$ ($k_j$ an integer) if some of the constants $c_j$ are equal to one another; the coefficients which enter must be periodic of period $1$ in $D$.

Consequently we may write

$I_i^*(D) = \displaystyle\sum_{j=1}^{r} p_{ij}^*(D)\, D^{k_j} e^{c_j D} \qquad (i = 1, \dots, r).$

Replacing $D$ by $D + 1, D + 2, \dots, D + r - 1$ in these equations for a particular value of $i$, there result $r$ linear algebraic equations in $p_{i1}^*(D), \dots, p_{ir}^*(D)$ which can be solved for these $r$ periodic functions. The explicit solution is linear in $I_i^*(D), \dots, I_i^*(D + r - 1)$. But we have seen that all these $r$ functions become infinite of a finite order in $|C|^{-1}$ when $C$ tends to zero. Indeed we have shown that $I_i^*(D)$ possesses this property, and the recurrence relations written above show us that $I_i^*(D + 1), \dots, I_i^*(D + r - 1)$ possess it also. Consequently the explicit formulas for the periodic functions $p_{ij}^*(D)$ show us that these functions are also of a finite order in $|C|^{-1}$. From this fact it follows that the functions $p_{ij}(C) = p_{ij}^*(D)$ are either analytic or tend to infinity when $C$ tends to $0$. Consequently $k(C)$ must be of the form

$k(C) = aC^d + bC^e + \cdots \qquad (a \ne 0;\ 0 < d < e < \cdots),$

or even of the form

$k(C) = C^d P(\log C) + C^e Q(\log C) + \cdots,$

$P(\log C), \dots$ being polynomials in $\log C$.

Recalling the role of the analogous equation (12), we see that the coordinates $r^*$, $\varphi^*$ will always have the properties stated.




8. Distribution of the e-periodic solutions.

It is evident not only that arbitrary e-periodic solutions possess neighboring e- and h-periodic solutions, but also that, beginning with these neighboring e-periodic solutions, which are so to speak satellites of these solutions, one can obtain other e-periodic and h-periodic solutions which are secondary satellites. Continuing in this way, and employing the general extension of our results to every non-degenerate elliptic solution, we can say:

The set of non-degenerate e-periodic solutions is dense in itself, with the exception of those non-degenerate e-periodic solutions whose neighboring e-periodic solutions are all degenerate, if there are any. All these solutions also possess an infinite number of neighboring h-periodic solutions.

Hence a result which I formerly obtained only on the hypothesis that all the e-periodic solutions are of general type is now established without exception.

It must not be imagined that the degenerate solutions must possess analogous properties. Indeed, in our last chapter we shall see that these solutions may fall into the category of irregular e-periodic solutions, for which there never exist asymptotic sets which interlace.


9. On the possibility of generalization.

Nothing prevents us from generalizing the formal theory which I developed in my long article in the Acta. Indeed, thanks to the special form of a contact transformation which defines $T$ (see the corresponding section of the preceding chapter) we obtain generally

(, 4 ) *2 (X/‘* da 1 " — Xy dxj) = d<f» % 


whence formally for $k = 0$


2 m —2 / 2--2 

2 ( 2 
y-i \ y-i 


^ 6xydx f ) + \.dba = d\ , 



which gives

*2* 6*,= */ (Y — 2X,5x,) = dl . 

j=\ \ -**/ / 

But the formal differential equations are

-^-- 8 */ (•' = «.")• 

Hence these equations are derived formally from the variational equation



and are Pfaffian in this sense.

In this way the formal part of my Memoir extends without any difficulty to the case of arbitrary $n$. One could even discuss the asymptotic form of $T$ in the same manner by introducing polar or quasi-polar variables $r_1^*, \varphi_1^*, \dots, \varphi_{n-1}^*$.

But so far I have found no geometric property capable of replacing the characteristic property in two dimensions ($n = 2$) which I have employed above (1).

CHAPTER III.

Hyperbolic solutions and the neighboring solutions.

1. The invariant curves in the general case.

In the general hyperbolic case the coefficients of the quadratic terms of the invariant series $F^*$ satisfy the inequality $pr - 4q^2 < 0$. There then exist two invariant analytic curves which cut one another at the fixed point without being tangent there. This fact was first proved by Poincaré (2) and later by Hadamard (1) and myself (2). These curves may be taken as the coordinate axes $u$, $v$ by making a suitable choice of the variables.

(1) In this connection, let us point out the recent article by D. C. Lewis and myself which concerns the distribution of the neighboring e-periodic solutions in the general elliptic case: On the Periodic Motions Near a Given Motion of a Dynamical System, Annali di Matematica, t. 12, sér. 4 (1933-34).

(2) See the fourth part of his Memoir cited above.

In such variables $u$, $v$ the transformation $T$ is written (3)

(1) $\qquad u_1 = \rho u\,(1 + a_{10}u + a_{01}v + \cdots), \qquad v_1 = \dfrac{1}{\rho}\,v\,(1 + \cdots) \qquad (\rho > 1).$

One concludes immediately that under repeated iteration of $T$ the points on the $v$-axis tend toward the origin, and also that under iteration of $T^{-1}$ the points of the $u$-axis tend toward the same point. The distance of the successive points from the origin diminishes very rapidly in both cases.

This form (1) of $T$ makes evident the fact that, for every point $(u, v)$ off the axes, one has the following inequalities

where $\varepsilon > 0$ is arbitrarily small in the immediate neighborhood of the fixed point.


2. A lemma.

I have shown (loc. cit.) that in this general hyperbolic case there exist two "normal forms" of $T$ analogous to those of equations (6), (6') of the preceding chapter for the general elliptic case:

(2) $\qquad u_1 = \rho u\, e^{c\,u^l v^l}, \qquad v_1 = \dfrac{1}{\rho}\, v\, e^{-c\,u^l v^l} \qquad (l \text{ an integer}),$

(2') $\qquad u_1 = \rho u, \qquad v_1 = \dfrac{1}{\rho}\, v.$

The case (2') may be considered as the case (2) with $l$ infinite.


(1) Sur l'itération et les solutions asymptotiques des équations différentielles, Bulletin de la Société Mathématique de France, t. 29 (1901).

(2) S. T., p. 45.

(3) Strictly speaking, $\rho$ may be negative. In that case we consider $T^2$ instead of $T$; by this procedure $\rho$ is replaced by $\rho^2 > 0$.




The importance of these normal forms consists in the fact that one can find actual variables $u$, $v$ such that $T$ is expressed in one of the two associated forms

(3) $\qquad u_1 = \rho u\, e^{c\,u^l v^l}(1 + P_\mu(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\, e^{-c\,u^l v^l}(1 + Q_\mu(u, v)),$

(3') $\qquad u_1 = \rho u\,(1 + P_\mu^*(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\,(1 + Q_\mu^*(u, v)),$

where the functions $P_\mu$, $Q_\mu$, $P_\mu^*$ and $Q_\mu^*$ are analytic at the origin and are given by series which begin with terms of arbitrarily high degree $\mu$.

It is therefore evident that the transformation $T$ imitates the corresponding transformation (2) or (2') very closely in the neighborhood of the fixed point.

Let us remark in the first place that the forms (3), (3') can be simplified a little further. Consider the transformation $T$ on the invariant $u$-axis, where one has

$u_1 = \rho u\,(1 + P_\mu(u, 0)).$

Now it is well known that a transformation in $u$ of this form can be reduced to the form

$\bar{u}_1 = \rho \bar{u},$

where $\bar{u} = \varphi(u) = u + \varphi_{\mu+1} u^{\mu+1} + \cdots$ denotes a certain function analytic at the origin (1). Introducing this variable $\bar{u}$ in place of $u$, and an analogous variable $\bar{v}$ in place of $v$, one obtains a transformation of the same form (3) which, moreover, reduces to $\bar{u}_1 = \rho\bar{u}$ and $\bar{v}_1 = \frac{1}{\rho}\bar{v}$ respectively on the two invariant axes. This shows us that in these variables $P_\mu(u, 0)$ and $Q_\mu(0, v)$ vanish identically. Consequently we may take $T$ in one of the two following forms:

$u_1 = \rho u\, e^{c\,u^l v^l}(1 + v\,P_{\mu-1}(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\, e^{-c\,u^l v^l}(1 + u\,Q_{\mu-1}(u, v)),$

$u_1 = \rho u\,(1 + v\,P_{\mu-1}^*(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\,(1 + u\,Q_{\mu-1}^*(u, v)),$

where $P_{\mu-1}$ and $Q_{\mu-1}$ are analytic at the origin and where their series expansions begin with terms of degree at least $\mu - 1$.

(1) This function $\varphi(u)$ is evidently determined by the functional equation

$\varphi\big(\rho u\,(1 + P_\mu(u, 0))\big) = \rho\,\varphi(u).$

For the proof that such a function always exists, see Note 1 by M. Picard in the book of MM. Picard and Simart, Théorie des fonctions algébriques de deux variables indépendantes, t. 2.

Let us now introduce other variables $\bar{u}$, $\bar{v}$ by writing

$\bar{u} = u\,(1 + f_\mu(v)), \qquad \bar{v} = v\,(1 + g_\mu(u)),$

where $f_\mu$ and $g_\mu$ are analytic in $v$ and $u$ respectively for $v = 0$ and $u = 0$, with series expansions which begin with terms of degree $\mu$ at least. The equations of transformation will remain of the same form

$\bar{u}_1 = \rho\bar{u}\, e^{c\,\bar{u}^l \bar{v}^l}(1 + \bar{v}\,\bar{P}_{\mu-1}(\bar{u}, \bar{v})), \qquad \bar{v}_1 = \dfrac{1}{\rho}\,\bar{v}\, e^{-c\,\bar{u}^l \bar{v}^l}(1 + \bar{u}\,\bar{Q}_{\mu-1}(\bar{u}, \bar{v})),$

as an immediate calculation shows. Let us now substitute on the left in the first equation

$\bar{u}_1 = u_1(1 + f_\mu(v_1)) = \rho u\, e^{c\,u^l v^l}(1 + v\,P_{\mu-1}(u, v))\Big(1 + f_\mu\big(\tfrac{1}{\rho}\, v\, e^{-c\,u^l v^l}(1 + u\,Q_{\mu-1}(u, v))\big)\Big).$

Evidently the complete coefficient of the first power of $u$ here is precisely

$\rho\,(1 + v\,P_{\mu-1}(0, v))\,(1 + f_\mu(v/\rho)).$

On the other hand, replacing $\bar{u}$ and $\bar{v}$ on the right by their explicit values in $u$ and $v$, the same coefficient is given by

$\rho\,(1 + f_\mu(v))\,(1 + v\,\bar{P}_{\mu-1}(0, v)).$

Comparing, we obtain the equation

$(1 + v\,P_{\mu-1}(0, v))\,\varphi(v/\rho) = (1 + v\,\bar{P}_{\mu-1}(0, v))\,\varphi(v),$

where $\varphi(v) = 1 + f_\mu(v)$. But the infinite product

$\displaystyle\prod_{j=0}^{\infty}\Big(1 + \frac{v}{\rho^j}\,P_{\mu-1}\big(0, \frac{v}{\rho^j}\big)\Big)$

converges to a function $\varphi(v)$ which satisfies the equation

$(1 + v\,P_{\mu-1}(0, v))\,\varphi(v/\rho) = \varphi(v)$

and which is of the form admitted for $1 + f_\mu(v)$. The function $\bar{P}_{\mu-1}(0, v)$ reduces to $0$ in this case. In the same way, by an analogous choice of $g_\mu(u)$, we can reduce $\bar{Q}_{\mu-1}(u, 0)$ to $0$ also.
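The convergence of an infinite product of this kind, and the functional equation it satisfies, can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the function `g` is a hypothetical stand-in for $v\,P_{\mu-1}(0, v)$ (here simply $v^3$), the value $\rho = 2$ is arbitrary, and the product is truncated after a fixed number of factors.

```python
def phi(v, rho=2.0, terms=60, g=lambda x: x ** 3):
    """Truncated product  prod_{j>=0} (1 + g(v / rho^j)),
    where g(v) stands in for v * P(0, v) and rho > 1."""
    prod = 1.0
    for j in range(terms):
        prod *= 1.0 + g(v / rho ** j)
    return prod

def functional_equation_gap(v, rho=2.0, g=lambda x: x ** 3):
    """Difference between (1 + g(v)) * phi(v / rho) and phi(v)."""
    return (1.0 + g(v)) * phi(v / rho, rho, g=g) - phi(v, rho, g=g)
```

Since $\rho > 1$, the factors approach $1$ geometrically fast, so the truncated product already satisfies the functional equation to rounding error; for instance `functional_equation_gap(0.3)` is negligible.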

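Consequently in these variables we shall have

(4) $\qquad u_1 = \rho u\, e^{c\,u^l v^l}(1 + uv\,P_{\mu-2}(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\, e^{-c\,u^l v^l}(1 + uv\,Q_{\mu-2}(u, v)),$

(4') $\qquad u_1 = \rho u\,(1 + uv\,P_{\mu-2}^*(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\,(1 + uv\,Q_{\mu-2}^*(u, v)).$

The importance of this form of the equations and of analogous forms will appear in what follows.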

In the Memoir cited above I did not consider the neighborhood of the fixed point in detail; indeed I contented myself with showing only that every point off the hypercontinuous invariant curves must leave a given neighborhood of this point, either under the indefinite repetition of $T$ or of $T^{-1}$. This fact is evident in the general hyperbolic case, as we have just seen. But in a brief later Note (1) I remarked incidentally that the existence of a new type of periodic solutions depends on another fairly plausible property of the neighboring solutions in the general hyperbolic case. This property plays a fundamental role in what follows. We shall prove it later, even for the more complicated special hyperbolic cases.

Indeed, in the general cases (2) and (2') the normal transformation leaves the family of hyperbolas $uv = \text{const.}$ invariant. It seemed to me at first sight fairly probable that there existed an analogous actual invariant family for the transformation $T$, and in my Note I made this hypothesis without justifying it. One of my students, Dr. C. B. Morrey, undertook the consideration of this question and of other analogous questions in his thesis entitled Invariant Functions of Surface Transformation (Harvard, 1931), relying on the quasi-normal form (3) of the transformation $T$; his interesting results are not yet published. But his study seemed to show that one can scarcely hope, with the use of (3) alone, to establish the property which I needed.

(1) On the Periodic Motions of Dynamical Systems, Acta Mathematica, t. 48 (1927).
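The invariance of the hyperbolas $uv = \text{const.}$ under the normal form (2) can be seen directly by iterating the map. The sketch below is only an illustration under stated assumptions: the parameter values `rho`, `c`, `l` and the starting point are hypothetical, and the function names are not from the text.

```python
import math

def normal_form(u, v, rho=2.0, c=0.5, l=1):
    """One step of the normal form (2):
    u1 = rho*u*exp(c*(u*v)^l),  v1 = (v/rho)*exp(-c*(u*v)^l)."""
    w = c * (u * v) ** l
    return rho * u * math.exp(w), v / rho * math.exp(-w)

def product_drift(u, v, steps=5, **params):
    """Largest relative change of the product u*v along a short orbit."""
    start = u * v
    worst = 0.0
    for _ in range(steps):
        u, v = normal_form(u, v, **params)
        worst = max(worst, abs(u * v - start) / abs(start))
    return worst
```

Because the two exponential factors are reciprocal and $\rho \cdot \frac{1}{\rho} = 1$, the product $uv$ is carried along unchanged, so `product_drift` stays at the level of rounding error. The question raised in the text is whether the actual transformation $T$, which only imitates (2), admits an analogous invariant family.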

Nevertheless I prove here that the desired conclusion holds, at least in the case of a conservative transformation, which is the case of interest to us. Our proof of this fact will depend on the use of the following Lemma, whose equations (5), (5') are more penetrating than those employed by Dr. Morrey.

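Lemma. By a suitable transformation of the variables $u$, $v$, one can always write the transformation $T$ in one of the two following forms:

(5) $\qquad u_1 = \rho u\, e^{c\,u^l v^l}(1 + u^\mu v^\mu P(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\, e^{-c\,u^l v^l}(1 + u^\mu v^\mu Q(u, v)),$

(5') $\qquad u_1 = \rho u\,(1 + u^\mu v^\mu P(u, v)), \qquad v_1 = \dfrac{1}{\rho}\, v\,(1 + u^\mu v^\mu Q(u, v)).$

In these equations the integer $\mu$ is arbitrarily large, and $P$ and $Q$ are analytic at the origin.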

To prove it I shall show in the first place that the invariant series (1)

$F^*(u, v) = uv \log\rho\,\Big(1 + \sum \varphi_{mn} u^m v^n\Big)$

is such that all the series $\sum_m \varphi_{mn} u^m$ and $\sum_n \varphi_{mn} v^n$ converge. This means that one can write, for example,

$F^*(u, v) = uv \log\rho\,\big(1 + \varphi_1(u)\,v + \varphi_2(u)\,v^2 + \cdots\big),$

where the functions $\varphi_n(u)$ are analytic at the origin. I proved this fact in passing in my Memoir by means of majorant series. Because of its high theoretical importance I shall give here another proof of it, independent and simpler.

(1) The invariant series $F^*$ must admit $uv$ as a factor, since $u = 0$ and $v = 0$ correspond to the invariant curves.




Let us begin by showing that the series $\varphi_1(u)$ must be convergent. Since $F^*$ is invariant, we shall have

$u_1 v_1\,(1 + \varphi_1(u_1)\,v_1 + \cdots) = uv\,(1 + \varphi_1(u)\,v + \cdots).$

Substituting the series (4) or (4') for $u_1$ and $v_1$, and comparing the coefficients in $v$ on the two sides, we obtain an identity. The comparison of the terms in $v^2$ gives us the following equation,

$\dfrac{1}{\rho}\,\varphi_1(\rho u) - \varphi_1(u) = -u\,\big(P_{\mu-2}(u, 0) + Q_{\mu-2}(u, 0)\big),$

which admits the explicit series solution

$\varphi_1(u) = -u \displaystyle\sum_{j=1}^{\infty} \Big(P_{\mu-2}\big(\tfrac{u}{\rho^j}, 0\big) + Q_{\mu-2}\big(\tfrac{u}{\rho^j}, 0\big)\Big).$

This solution is analytic at the origin and its series expansion begins with a term of degree $\mu - 1$ at least. But there is only a single formal series solution, apart from an arbitrary linear term. Hence the coefficient $\varphi_1(u)$ must be analytic, as we wished to prove.
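A functional equation of the type $\frac{1}{\rho}\varphi_1(\rho u) - \varphi_1(u) = -u\,H(u)$ and its series solution can be checked numerically. The sketch below is only an illustration under stated assumptions: `H` is a hypothetical stand-in for $P_{\mu-2}(u, 0) + Q_{\mu-2}(u, 0)$, the value $\rho = 2$ is arbitrary, and the series is truncated after a fixed number of terms.

```python
def H(u):
    """Hypothetical stand-in for P(u, 0) + Q(u, 0); it vanishes to order 2 at u = 0."""
    return u ** 2 + 0.5 * u ** 3

def phi1(u, rho=2.0, terms=80):
    """Truncated series  -u * sum_{j>=1} H(u / rho^j)."""
    return -u * sum(H(u / rho ** j) for j in range(1, terms + 1))

def residual(u, rho=2.0):
    """Left side minus right side of  (1/rho)*phi1(rho*u) - phi1(u) = -u*H(u)."""
    return phi1(rho * u, rho) / rho - phi1(u, rho) + u * H(u)
```

The terms of the series decrease geometrically because $\rho > 1$ and `H` vanishes at the origin, so the truncated sum satisfies the equation to rounding error. Adding any multiple of $u$ to `phi1` leaves the residual unchanged, which reflects the arbitrary linear term mentioned in the text.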

In the same manner the comparison of the terms in $v^3$ gives us an analogous equation

$\dfrac{1}{\rho^2}\,\varphi_2(\rho u) - \varphi_2(u) = f_2(u),$

where $f_2$ is analytic at the origin because of the form of $\varphi_1(u)$ already established. Continuing in this way we deduce without difficulty that $\varphi_2(u), \varphi_3(u), \dots$ are analytic.

The same manner of reasoning applies to the analogous coefficients $\psi_1(v), \psi_2(v), \dots$ of the series arranged in powers of $u$, and shows us that these functions are likewise analytic.

With the aid of this result we can prove the Lemma. Let us start from variables u, v for which T has the form (4) (1). Replacing $\mu$ by $2\mu$, we shall have for such variables

u . = e* ue“-' •* (l + hcP-V-s (■• °))> l * = 7. ** ( 1 + “"CV-* (“•«)) 


(1) The case (4') is entirely analogous and even simpler.




whatever k may be. Taking the formal derivative with respect to k for k = 0 (1), we obtain


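(6)   $\delta u = \frac{1}{Q}\,\frac{\partial F^*}{\partial v} = u \log\rho\,\bigl(1 + c^* u^l v^l + uv\,P^*_{2\mu-2}(u,v)\bigr),$

      $\delta v = -\frac{1}{Q}\,\frac{\partial F^*}{\partial u} = -v \log\rho\,\bigl(1 + c^* u^l v^l + uv\,Q^*_{2\mu-2}(u,v)\bigr),$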


where the analytic function Q which enters is such that $\iint Q\,du\,dv$ is the integral invariant, with Q = 1 for u = v = 0, and where $c^* = c/\log\rho$. From these two equations one deduces in the first place



= *• t>* R 2k _ 2 (u, v ) . 


Par consequent F* ne contient aucun terme qui nest pas divisible par u' v 9 , 
le* premier terme uv log p excepte: 

F* = UV log Q (1 -f **fR*2|4-2 (“»*0) • 

Because of the property of the partial series in F* which we have just proved, we can write

(7)   $F^* = uv \log\rho\,\bigl(1 + uv\,G_{2\mu-2}(u,v) + u^\mu v^\mu H(u,v)\bigr),$

where the analytic function $uv\,G_{2\mu-2}(u,v)$ denotes the part of the series $F^*/(uv\log\rho)$, other than the constant 1, which is not divisible by $u^\mu v^\mu$. Let us now write

$\bar u = u, \qquad \bar v = v\,\bigl(1 + uv\,G_{2\mu-2}(u,v)\bigr),$

remarking that the preceding equations remain valid even in the new variables. On substituting these variables, the series F* in these variables takes the form

(8)   $F^*(u,v) = uv \log\rho\,\bigl(1 + u^\mu v^\mu R(u,v)\bigr),$


(1) The formal theory employed here is found in S. T., Chap. 1.




where we put u, v in place of $\bar u$, $\bar v$ to simplify the notation. Recalling equations (6), we obtain

(9)   $Q(u,v) = \frac{1}{1 + c^* u^l v^l} + uv\,S_{2\mu-2}(u,v).$

We now propose to modify Q(u, v) by a succession of changes of variables which do not alter the properties already obtained. We shall employ changes of the following form:


(10) 

’ /ft 5 ) 

ou 


00 

/ft*)—(■»,») • 


In such variables the equations (4), (5), (6), (8) and (9) remain valid. These changes of variables constitute a group.

The function Q(u, v) in such variables is determined by the evident equation


which reduces here to


(12) Q— Q ( u .") (1 + u '°- gr ) • 

\ ^ U / 


Let us now show that by a succession of transformations of this group one can make all the terms of degree less than $\mu$ in u or in v disappear from $S_{2\mu-2}$, that is to say, that one can reduce Q(u, v) to the form

(13) Q ("■<■)•= ! 4 (“.<’) • 


Let us first make the terms of second degree in u or in v in Q disappear (see (9)), by writing

$Q(u,v) = \frac{1}{1 + c^* u^l v^l} + uv\,\bigl(u\,e_{2\mu-3}(v) + v\,f_{2\mu-3}(u)\bigr) + u^2 v^2\,S_{2\mu-4}(u,v)$




in order to fix attention on these terms. For this purpose let us choose f such that

log f = ltv(ug 2tt _ 3 (v) + *2,1-3 (“))» 

and let us express Q(u, v) in terms of the corresponding new variables, which gives us

Q(i/,t>) =——L— r +i»(«*fc~3(*)+ ”/*-*(“)) + (“••’) • 

i -j- C*U V 

On the other hand we obtain

- ^ log J_ — y * -L = uv(uy 2 u.-$(,'>) — vk ip-3( u ) + “*• ,t 2|»-4(“» w ) • 

-iu tv 

Referring to equation (12), we see that the terms under consideration will disappear if the two functions $g_{2\mu-3}(v)$ and $h_{2\mu-3}(u)$ are such that

Consequently we can write, after such a transformation,

$Q(u,v) = \frac{1}{1 + c^* u^l v^l} + u^2 v^2\,S_{2\mu-4}(u,v).$

To go further let us write

$Q(u,v) = \frac{1}{1 + c^* u^l v^l} + u^2 v^2\,\bigl(u\,e_{2\mu-5}(v) + v\,f_{2\mu-5}(u)\bigr) + u^3 v^3\,S_{2\mu-6}(u,v),$

and let us choose the analogous function f,

$\log f = u^2 v^2\,\bigl(u\,g_{2\mu-5}(v) + v\,h_{2\mu-5}(u)\bigr).$

As before we find a function Q such that

$Q = \frac{1}{1 + c^* u^l v^l} + u^3 v^3\,S_{2\mu-6}(u,v).$




Continuing thus we arrive without difficulty at a form of Q for which (13) holds, while retaining the form (8) of F*. Consequently equations (6) take the form

(6')  $\delta u = u \log\rho\,\bigl(1 + c^* u^l v^l + u^\mu v^\mu D(u,v)\bigr),$

      $\delta v = -v \log\rho\,\bigl(1 + c^* u^l v^l + u^\mu v^\mu E(u,v)\bigr).$

Integrating the corresponding formal differential equations (see section 1, chapter II), one obtains for k = 1 the equations (5) of our Lemma.
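The last integration step can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the formal argument of the text: it drops the remainder terms $u^\mu v^\mu D$ and $u^\mu v^\mu E$ of (6'), picks hypothetical parameter values, integrates $du/dk = \delta u$, $dv/dk = \delta v$ from k = 0 to k = 1 with a Runge-Kutta scheme, and compares the result with the leading part of (5), using $c = c^* \log\rho$.

```python
import math


def flow_step_rk4(u, v, h, log_rho, c_star, l):
    """One RK4 step of du/dk = u*L*(1 + c*(uv)^l), dv/dk = -v*L*(1 + c*(uv)^l)."""
    def rhs(a, b):
        factor = log_rho * (1.0 + c_star * (a * b) ** l)
        return a * factor, -b * factor

    k1 = rhs(u, v)
    k2 = rhs(u + 0.5 * h * k1[0], v + 0.5 * h * k1[1])
    k3 = rhs(u + 0.5 * h * k2[0], v + 0.5 * h * k2[1])
    k4 = rhs(u + h * k3[0], v + h * k3[1])
    u_new = u + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
    v_new = v + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return u_new, v_new


def time_one_map(u, v, log_rho, c_star, l, steps=1000):
    """Integrate the truncated equations (6') from k = 0 to k = 1."""
    h = 1.0 / steps
    for _ in range(steps):
        u, v = flow_step_rk4(u, v, h, log_rho, c_star, l)
    return u, v


def normal_form(u, v, log_rho, c_star, l):
    """Leading part of (5): u1 = rho*u*exp(c (uv)^l), v1 = v/(rho*exp(c (uv)^l))."""
    c = c_star * log_rho
    growth = math.exp(log_rho + c * (u * v) ** l)
    return u * growth, v / growth


# Hypothetical values chosen only for the illustration.
print(time_one_map(0.3, 0.2, math.log(2.0), 0.5, 1))
print(normal_form(0.3, 0.2, math.log(2.0), 0.5, 1))
```

The two printed pairs agree to integration accuracy because the product uv is constant along the truncated flow, so the equations are linear in u and in v separately.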


3. The neighborhood of a general A-periodic solution.

After these preliminaries we can prove the following result, and thus justify in the general hyperbolic case the conjecture of my Note:

Let (u, v) be coordinates in the general hyperbolic case such that the two axes are the invariant curves while equations (5) or (5') hold. In such coordinates there exists a family of invariant curves of class $C^{(k)}$ (1) except at the fixed point, which fill the whole neighborhood of this point, in such a way that these curves follow the hyperbolas uv = const. to an arbitrary order in the product uv, while their slopes dv/du are given by -v/u to the same order, and more generally $d^2v/du^2, \ldots, d^kv/du^k$ are given by corresponding formulas. In particular, these invariant curves approach the invariant axes uniformly outside any neighborhood of the fixed point, while their slopes approach those of the corresponding axes (see the accompanying figure).



(1) This notation signifies that in the neighborhood of every particular point the equation of the curve can be expressed in the form y = f(x), where $f, f', \ldots, f^{(k)}$ are continuous and x, y are rectangular coordinates.




In what follows we restrict ourselves to the case k = 1, which alone actually enters into our theory. The extension of the reasoning to the case k > 1 is immediate.

Let us introduce variables U and V defined by the equations

$u = e^{-V}, \qquad v = e^{-U},$

for u > 0, v > 0. In these variables the transformation T (taken in the form (5)) has the form (1)

(14)  $U_1 = U + \log\rho + c\,e^{-l(U+V)} + e^{-\mu(U+V)}\,\Phi(e^{-U}, e^{-V}),$
      $V_1 = V - \log\rho - c\,e^{-l(U+V)} + e^{-\mu(U+V)}\,\Psi(e^{-U}, e^{-V}),$          $(c > 0)$

where $\Phi$ and $\Psi$ are analytic for u = v = 0. The neighborhood of the origin (k large) corresponds to the region $\Gamma_k$ of the plane of the variables U, V for which $U \ge k$, $V \ge k$. Let us write moreover W = U + V. We obtain immediately

(15)  $W_1 = W + e^{-\mu W}\,\chi(e^{-U}, e^{-V}),$

where $\chi$ is of the same nature as $\Phi$ and $\Psi$.

But for the positive number k sufficiently large, equations (14) show us that the following inequalities hold,

$U_1 > U + \tfrac{1}{2}\log\rho, \qquad V_1 < V - \tfrac{1}{2}\log\rho.$


It follows that the point P leaves the region $\Gamma_k$ after fewer than $2W/\log\rho$ iterations of T or of $T^{-1}$.
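The exit-time bound can be observed on a simplified model. The sketch below is only an illustration: it keeps the explicit terms of (14), drops the remainders $e^{-\mu(U+V)}\Phi$ and $e^{-\mu(U+V)}\Psi$, and uses hypothetical values of $\log\rho$, c, l and k. It counts the iterations needed to leave the region $U \ge k$, $V \ge k$ and compares the count with $2W/\log\rho$.

```python
import math


def model_step(U, V, log_rho, c, l):
    """One step of the truncated map (14): U gains and V loses the same shift."""
    shift = log_rho + c * math.exp(-l * (U + V))
    return U + shift, V - shift


def exit_time(U, V, k, log_rho, c, l):
    """Number of iterations before (U, V) leaves the region U >= k, V >= k."""
    n = 0
    while U >= k and V >= k:
        U, V = model_step(U, V, log_rho, c, l)
        n += 1
    return n


# Hypothetical starting point on the line U = V.
U0 = V0 = 10.0
log_rho = math.log(2.0)
n_exit = exit_time(U0, V0, 5.0, log_rho, 1.0, 1)
bound = 2.0 * (U0 + V0) / log_rho
print(n_exit, bound)
```

In this truncated model W = U + V is exactly conserved, which is the limiting form of the statement (15) when the remainder is neglected.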

But one has everywhere in $\Gamma_k$

$|W_1 - W| \le e^{-\mu W}\,\bar\chi,$

where $\bar\chi$ denotes an upper bound of the function $\chi(e^{-U}, e^{-V})$. This inequality assures us that $W_n$ does not increase more rapidly with n than the function W(n) (W(0) = W) defined by the differential equation

$\frac{dW}{dn} = e^{-\mu W}\,\bar\chi,$

(1) The reasoning in the case of a normal form (5') is the same.




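and does not decrease more rapidly than the function W(n) (W(0) = W) defined by the equation

$\frac{dW}{dn} = -e^{-\mu W}\,\bar\chi.$

But these equations give respectively:

$e^{\mu W(n)} - e^{\mu W} = \mu\bar\chi\, n, \qquad e^{\mu W(n)} - e^{\mu W} = -\mu\bar\chi\, n.$

Hence during $n < 2W/\log\rho$ iterations we shall certainly have

$\bigl|e^{\mu W_n} - e^{\mu W}\bigr| \le \mu\bar\chi\, n,$

and thus

(16)  $|W_n - W| < e^{-\mu W}\, W\, \Omega,$

where $\Omega$ is a numerical constant.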
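The closed form of the comparison equation and the resulting bound (16) can be checked numerically. The following sketch assumes hypothetical values of $\mu$, $\bar\chi$, $\log\rho$ and of the initial value; it integrates $dW/dn = \bar\chi e^{-\mu W}$ by a Runge-Kutta scheme, compares with $e^{\mu W(n)} = e^{\mu W(0)} + \mu\bar\chi n$, and verifies that over $n \le 2W/\log\rho$ steps the drift stays below $\bar\chi\, n\, e^{-\mu W(0)}$, which is of the form $e^{-\mu W} W \Omega$ with $\Omega = 2\bar\chi/\log\rho$.

```python
import math


def closed_form(W0, n, mu, chi):
    """Solution of dW/dn = chi*exp(-mu*W): exp(mu*W) = exp(mu*W0) + mu*chi*n."""
    return math.log(math.exp(mu * W0) + mu * chi * n) / mu


def integrate(W0, n_end, mu, chi, steps=10000):
    """RK4 integration of the same comparison equation from 0 to n_end."""
    h = n_end / steps
    W = W0

    def f(w):
        return chi * math.exp(-mu * w)

    for _ in range(steps):
        k1 = f(W)
        k2 = f(W + 0.5 * h * k1)
        k3 = f(W + 0.5 * h * k2)
        k4 = f(W + h * k3)
        W += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
    return W


# Hypothetical values for the illustration.
W0, mu, chi, log_rho = 4.0, 3.0, 2.0, math.log(2.0)
n_max = 2.0 * W0 / log_rho
drift = closed_form(W0, n_max, mu, chi) - W0
omega = 2.0 * chi / log_rho
print(drift, math.exp(-mu * W0) * W0 * omega)
```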

It thus appears that W remains very nearly constant during the iteration of T or of $T^{-1}$ up to the moment when the point $(U_n, V_n)$ leaves $\Gamma_k$. But it follows from equations (14) and from (16) that the following inequalities also hold:

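(17)  $\bigl|U_n - U - n\log\rho - n\,c\,e^{-lW}\bigr| \le e^{-\mu W}\, W\, \Omega^*,$
      $\bigl|V_n - V + n\log\rho + n\,c\,e^{-lW}\bigr| \le e^{-\mu W}\, W\, \Omega^*,$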


where $\Omega^*$ is also a numerical constant.

Let us now consider the line U = V of the plane of the variables U, V. Each point $(U_0, V_0)$ of this line is transformed into a point $(U_1, V_1)$, where the coordinates $U_1$, $V_1$ are very nearly equal to

$U_0 + \log\rho + c\,e^{-l(U_0+V_0)}$ and $U_0 - \log\rho - c\,e^{-l(U_0+V_0)}$

respectively, according to equations (14). The corresponding errors do not exceed the quantity $\Omega^* e^{-\mu(U_0+V_0)}$. Let us construct the segment of the line which joins $(U_0, V_0)$ to $(U_1, V_1)$. It is a segment of length very nearly equal to $\sqrt{2}\,\log\rho$ which makes an angle of very nearly $-\pi/4$ with the positive U axis. The image of this line will be the arc of an analytic curve which joins $(U_1, V_1)$ to $(U_2, V_2)$




and whose tangent line makes everywhere an angle of very nearly $-\pi/4$ with the U axis, according to the explicit formulas (14).

Let us now replace the linear segment between $(U_0, V_0)$ and $(U_1, V_1)$ by a regular arc which joins these two points and which differs little from this line in position and slope, but which is tangent to the image of this line at



the point $(U_1, V_1)$ (see the accompanying figure). In particular one can take for the equation of this curve

v-v. = U^:<u-u.) + 

(u. = v„). 



This equation represents the parabola of third degree in U which passes through $(U_0, V_0)$ and $(U_1, V_1)$ in such a way that the directions of the tangents at the points $(U_0, V_0)$ and $(U_1, V_1)$ are those of the linear segment and of its image respectively. Substituting the explicit expressions in terms of the parameter $U_0$, this equation takes the following form


$V = 2U_0 - U + e^{-2\mu U_0}\,(U - U_0)\,\bigl[\alpha(e^{-2U_0}) + \beta(e^{-2U_0})\,(U - U_0) + \gamma(e^{-2U_0})\,(U - U_0)^2\bigr],$

where $\alpha$, $\beta$, $\gamma$ are analytic in $e^{-2U_0}$.

Consequently the arcs of these auxiliary parabolas joining every point $(U_0, V_0)$ to $(U_1, V_1)$ constitute an analytic family of arcs which completely fill the fundamental region between the line V = U and its image under T, in such a way that each point P of this region lies on one curve of this kind and on only one. Moreover the curves formed by these arcs




and their successive images are of class $C^{(1)}$ and remain everywhere near the line $U + V = 2U_0$; more exactly, the distance between the curve and the line does not exceed a fixed multiple of $U_0\,e^{-2\mu U_0}$. One sees then that the family of curves thus obtained resembles the family U + V = const. very closely at a great distance from the origin.

This reasoning shows also that to each point P = (U, V) of $\Gamma_k$ there corresponds a particular value of $U_0$, where $(U_0, U_0)$ denotes the point of intersection of the invariant curve containing (U, V) with the line V = U.

Let us now study the slope of this invariant curve. Evidently the slope of the fundamental arc differs from -1 by a quantity of the order of $e^{-2\mu U_0}$. To discuss this slope along this curve we begin with the following equations deduced from (14):

(18)  $dU_1 = d\bigl(U + c\,e^{-l(U+V)}\bigr) + e^{-\mu(U+V)}\,(F_{11}\,dU + F_{12}\,dV),$
      $dV_1 = d\bigl(V - c\,e^{-l(U+V)}\bigr) + e^{-\mu(U+V)}\,(F_{21}\,dU + F_{22}\,dV),$

where the $F_{ij}$ are analytic in $e^{-U}$, $e^{-V}$ and vanish at the origin. From these equations (18) we obtain

dV dU, - dU dV, - (f/VJ + d \y 4. # hmu+v> (G„ dU* + 

° 9) + *G„rfUdV + G„dV) 

where the functions $G_{ij}$ are of the same type as the $F_{ij}$. Let us denote by $\theta$ the angle between the tangent to an auxiliary curve and the direction of the line V = -U + const. which passes through the same point of the curve; and let us denote by ds the element of length of this curve. Equation (19) can be written in the following manner,

(20) sin (8, - 8) - [ 4 «-'< u + v > sin* 8 + e-i*> u - v < A (*-*>, e ~' v , 8)] ^ , 

where the function A is bounded and periodic of period $2\pi$ in $\theta$, and $ds/ds_1$ is bounded.

Consequently the following inequality must hold along such an invariant curve in the region considered:

(21)  $|\theta_1 - \theta| < K\,\bigl(e^{-lW}\,\theta^2 + e^{-\mu W}\bigr).$



On making the comparison it appears then that $|\theta|$ increases less rapidly with the iteration than the solution of the following differential equation:


$\frac{d\theta}{dn} = K\,\bigl(e^{-lW_0}\,\theta^2 + e^{-\mu W_0}\bigr).$


Let us recall also the fact that the initial value $\theta_0$ along the fundamental arc does not exceed $K e^{-2\mu U_0}$ in absolute value. It follows that $\theta_n$ increases less rapidly with the iteration than the function $\theta(n)$ for which

= 2K (e- v * d + ■ & ( 0 ) = 3 Kr-* u « . 


Mais en partant de la condition initiate d ( 0 ) = 0 il est Evident que 
0 (I) — Par consequent 0 (Ar) n'excedera pas la solution de 

cette meme Equation differentielle qui satisfait k la condition auxiliaire 
0* ( 0 ) -a 0 . En integrant nous trouvons la solution explicite. 


a*(n) = 


tKe-w* n 

1 — *Ke-‘»H-«iu 0 f| 


We see thus that $\theta^*(n+1)$, and consequently $\theta(n)$, does not exceed a fixed multiple of $e^{-\mu W_0}\,U_0$ in $\Gamma_k$.

We obtain in this manner the conclusion stated for k = 1. The case k = 2 can be discussed on the basis of this result for k = 1. Indeed one knows now that dV/dU is very nearly equal to -1 along such an invariant curve, and that $d^2V/dU^2$ is very small at least on the fundamental arc. By the use of equations (14) one can obtain $d^2V_1/dU_1^2$ in terms of U, V, dV/dU, $d^2V/dU^2$, and in this manner one can prove without difficulty that $d^2V/dU^2$ will remain small. Analogous considerations apply to the cases k = 3, 4, ... also. In this manner we obtain the conclusion stated for k = 2, 3, ... also.


4. Extension to the other hyperbolic cases.


The preceding result suffices for our theoretical purposes in the general hyperbolic case. We shall treat the other special cases in order to complete the theory.




In every other case of a hyperbolic fixed point (1), let F* be the fundamental invariant series and F(u, v) a polynomial which coincides with F*(u, v) up to terms of sufficiently high degree. There will then exist a family of invariant curves which covers the whole neighborhood of the fixed point (except the points situated on the hypercontinuous invariant curves passing through the fixed point itself), in such a way that each curve follows a curve F = const. everywhere to the order $\mu$ ($\mu$ arbitrary) (2). (See the accompanying figure.)



Let us remark in the first place that the method employed above must be considerably modified, since in these special cases there is no longer any normal form. In fact, one must now take account of an infinite number of invariants instead of a single one.



(1) However we leave partially aside the hyperbolic case in which there are invariant curves composed of fixed points.

(2) More exactly this signifies for us: (1) there exists a curve F = C such that the distance from a point of this curve to a corresponding point of the invariant curve is less than $K r^{\mu}$; (2) the angle between the corresponding tangents is also less than $K r^{\mu}$; (3) the direction of the tangents tends toward the direction of the hypercontinuous invariant curves along these curves.



Let then OL and OM be two adjacent hypercontinuous arcs among the hypercontinuous invariant curves. These curves may be tangent to one another. But, because of their different asymptotic developments, one can insert an auxiliary analytic curve ON between them (see the accompanying figure).

If OL and OM are not tangent to one another at O, one can take ON to be an ordinary line situated between the two tangent lines. If OL and OM are tangent at the point O, the corresponding series developments can be written

$v = c\,u^{\alpha} + \cdots$ and $v = d\,u^{\beta} + \cdots$   $(\alpha, \beta > 1),$

taking the u axis as the direction of the common tangent. One can then write the equation of ON in the following manner, $v = e\,u^{\gamma} + \cdots$, where this series contains all the initial terms common to the equations of OL and OM, and a term of the least degree among the other terms but with a coefficient between the two coefficients in the two series for OL and OM. Of course, one of these coefficients may reduce to zero.

The image $ON_1$ of ON under T will have contact of finite order with ON, since ON is not one of the invariant curves; hence $ON_1$ will necessarily be situated either to the right or to the left of ON, and we can speak of the fundamental region between ON and $ON_1$.

Let us now replace the series F* by a polynomial F with the same development as F* up to terms of high degree $\mu$, while retaining the same function Q. We thus define a transformation $\bar T$ very nearly the same as T, which can be regarded as effected by following the transformation T by another, U, for which the developments of

$u_1 - u$ and $v_1 - v$ begin with terms of degree $\mu - 1$ at least. Consequently

the fundamental region of $\bar T$ will be almost the same as that of T. In particular, the curve ON will be transformed by $\bar T$ into a curve which has contact with $ON_1$ of order $\mu - 1$ at least at the point O. This transformation $\bar T$ is integrable, and the points P, $\bar T(P)$, $\bar T^2(P)$, ... are all situated on the same algebraic invariant curve F = C.

Constructing now the algebraic arcs F = const. between ON and $ON_1$, one finds an analytic family of arcs PQ which join each point P of ON to the corresponding point Q of $ON_1$, in such a way that the image $P_1Q_1$ of PQ coincides very nearly with the analytic continuation of PQ (see the same figure). One can then modify the arcs PQ into the arcs $PP_1$ in such a way that these curves follow the curves F = const. to the order $\mu$ in the fundamental region $NON_1$. The only difficulties which one encounters here are of a




purely algebraic nature. One can convince oneself of this fact by beginning with the simplest cases.

One obtains thus a family of curves $PP_1P_2\ldots$ invariant under T, quite similar to the algebraic family F = const. We shall prove later that this family in its totality covers the region LOM entirely, with the exception of the hypercontinuous invariant curves OL and OM.

We have now to follow these curves $PP_1P_2\ldots$ beyond the fundamental region $NON_1$ on the side of $ON_1$ up to a curve OM* in the interior of $N_1OM$ and having contact of arbitrary but fixed order with OM; its prolongation $PP_{-1}P_{-2}\ldots$ in the opposite direction can be studied in the same manner, replacing T by $T^{-1}$. The point P must leave the region NOM* after a number of iterations of T of order d at most in 1/F, where d does not depend on $\mu$ (1).

Let us now consider a point P and its image $P_1$ under the given transformation T. It is evident that in the region NOM* the following inequality holds for $\mu$ and $\nu$ arbitrarily large:

(22)  $|F_1 - F| \le C\,r^{\mu} \le D\,F^{\nu}.$

From the associated differential equation

$\frac{dF}{dn} = D\,F^{\nu}$

we can conclude without difficulty that $|F/F_0 - 1|$ remains of arbitrarily high order in $F_0$ (where $F_0$ denotes the value of F at the point P) all along $PP_1P_2\ldots$ in NOM*. One must remember here the fact that the number of necessary iterations of $\bar T$, and hence of T, does not exceed $K F^{-d}$, where d is independent of $\nu$. Consequently the invariant curves $PP_1P_2\ldots$ follow closely the corresponding algebraic curves which pass through the same point P, up to the moment when the invariant curves leave the region NOM*.
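The conclusion drawn from the associated differential equation can be made concrete. The sketch below is an illustration with hypothetical constants: it uses the closed-form solution $F(n)^{1-\nu} = F_0^{1-\nu} - (\nu - 1) D n$ of $dF/dn = D F^{\nu}$ and evaluates the relative change $F/F_0 - 1$ after $n = K F_0^{-d}$ steps, which is of the order of $D K F_0^{\nu - 1 - d}$ and therefore small when $\nu$ is large compared with d.

```python
def drift_solution(F0, n, D, nu):
    """Closed-form solution of dF/dn = D * F**nu with F(0) = F0, for nu > 1."""
    base = F0 ** (1 - nu) - (nu - 1) * D * n
    if base <= 0:
        raise ValueError("the comparison solution blows up before n")
    return base ** (-1.0 / (nu - 1))


def relative_change(F0, D, nu, K, d):
    """Value of F/F0 - 1 after n = K * F0**(-d) steps of the comparison equation."""
    n = K * F0 ** (-d)
    return drift_solution(F0, n, D, nu) / F0 - 1.0


# Hypothetical constants: D = K = 1, nu = 6, d = 2.
for F0 in (0.1, 0.01):
    print(F0, relative_change(F0, 1.0, 6, 1.0, 2), F0 ** (6 - 1 - 2))
```

The printed relative change shrinks like $F_0^{3}$ for these values, consistent with the claim that it is of arbitrarily high order in $F_0$ once $\nu$ is taken large.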

(1) The length of the algebraic curves F = C between ON and OM* is finite, and hence the number of iterations of $\bar T$,

/*=-/*■ 

does not exceed a constant multiple of the maximum of $1/\sqrt{F_u^2 + F_v^2}$. But for $\mu$ large, $\sqrt{F_u^2 + F_v^2}$ is at least as great as $L\,r^{\delta}$, where $\delta$ does not depend on $\mu$; and F and r are of comparable orders in NOM*. Whence the conclusion indicated.




We can now prove that the curves $PP_1P_2\ldots$ will have very nearly the same direction as the corresponding curves F = C. Indeed let us start from the identity

(23)  $F_u(u_1, v_1)\,du_1 + F_v(u_1, v_1)\,dv_1 = F_u(u, v)\,du + F_v(u, v)\,dv + P(u, v)\,du + R(u, v)\,dv,$

where P and R are polynomials which begin with terms of degree at least $\mu$. This identity results immediately from the formal identity $dF_1^* = dF^*$. Hence if we write

$\sin\varphi = \dfrac{F_u\,u' + F_v\,v'}{\sqrt{F_u^2 + F_v^2}\,\sqrt{u'^2 + v'^2}},$

where u' and v' are the derivatives of u and v along the invariant curve which passes through the point, $\varphi$ represents the angle between the direction of this

invariant curve and that of the curve F = const. at this point.

Let us now introduce the two variables

W-F.-' + F.,. Z = Q • 


From the identity (23) we obtain


W 


|S== xv4-Pi/4-Rd = (i 4- TfT^r-) w 4- 


PFp - RF. 
Q 


Consequently the following inequality must hold for a suitable choice of C,

, W| -W|^C(F^ | W|4-r |Z|) , 

where $\alpha$ denotes a positive number large enough that


yr .’+iv > c f* 

in NOM*. To obtain an analogous inequality for $|Z_1 - Z|$, let us remark that along a formal curve F* = const. one will necessarily have

_ n Fp* du - F„* dv _ Fp , » il u i - l > *dr , 

_ v F #l K * t F *. + 




according to equations (5) of chapter II. But this signifies that there exists another formal identity of the following type,


0 , 


F,» du. — F,* do, 

F-.^ + P..** 


F* du — F *dv 


S* dF* 


e.' + F.' 1 + (F.** + F.**) (F. .** + F.,*’) ’ 


where S* is a formal series in the powers of u and v. This new identity can also be written in the following manner:

$(F_u^{*2} + F_v^{*2})\,(F_{u_1}^{*2} + F_{v_1}^{*2})\,(Z_1^* - Z^*) = S^*\,W^*.$

Here W* and Z* are defined with respect to F* in the same manner as W and Z were with respect to F. Thus we must have a relation

$(F_u^{2} + F_v^{2})\,(F_{u_1}^{2} + F_{v_1}^{2})\,(Z_1 - Z) = S\,W + P\,u' + R\,v',$


where S is a polynomial which begins with the same terms as S* up to an arbitrary degree, and where P and R are polynomials which begin with terms of degree $\mu$ at least. Consequently we obtain the following second inequality for $\mu > \alpha$:


|Z.-Z|^D(F- 4 -|W| 4 -F ! ‘-“|Z|) • 


In this manner we see that $|W_n|$ and $|Z_n|$ increase less rapidly with n than the solution W(n), Z(n) of the following differential equations,

i* = C( rw4rz), | = D(r"w + r ,, z), 


defined by the auxiliary conditions

$W(0) = |W_0|, \qquad Z(0) = |Z_0|.$

But in the fundamental region $NON_1$, $W_0$ is small of order $\mu$ at least in F; and consequently $Z_0$ must be of order $-\alpha$ at most; we suppose here that $u' = \cos\varphi$, $v' = \sin\varphi$ in $NON_1$. It follows then that we can replace the conditions above by the following conditions,

$W(0) = K F^{\mu}, \qquad Z(0) = K F^{-\alpha},$

and W(n), Z(n) will increase still more rapidly than $W_n$, $Z_n$.


Let us now write

W (n) = W (n) , Z (n) = F* $ Z (n) . 

The transformed equations and conditions will then be

-fg_C(F<~W + F*Z). + 

W ( 0 ) -= KK“ , Z ( 0 ) — KF 2 ■* . 

It follows that $\bar W$ and $\bar Z$ increase less rapidly with n than for the solution of the following equations and conditions:


BF* ' 4l (W + l) 
Z (0) = KF^“ . 

Par consequent la fonction U - W Z d^finie par 

— EF^‘ 4 "U , 0(0) —8KF^’ 


^L-^-(W + l,. -g 


— Ji -• 

W( 0 ) — KF 2 


(E = C + D) , 


will increase more rapidly still; this function is given explicitly by the equation

U(n) = 2 KF^'* + EF^‘ 4 *U« , 


ou la valeur de n n’excide pas un certain multiple de F" p pour un choix 
convenable de p. Done,* pour n suffisamment grand, U (n) aussi bien que 




W(n) et Z(n) n’excedent pas un certain multiple de F 2 . Cela nous montre 

que W (n) est de cet ordre au plus et que Z(n) est de 1 ’ordre de F - * au plus. 

Let us now return to the equation for dZ/dn, making use of these facts. We find immediately the following inequality:


dZ I 
dn 


_ Ji 
< DF 2 





But in $NON_1$, Z(0) does not differ much from $(F_u^2 + F_v^2)^{-1/2}$, since we have $u' = \cos\varphi$, $v' = \sin\varphi$. The preceding inequality shows us then that Z(n) remains very nearly equal to Z(0) everywhere, for $\mu$ sufficiently large. Consequently $(F_u^2 + F_v^2)\tan\varphi = W/Z$ remains small of arbitrarily high order in F. Hence $\tan\varphi$ remains likewise small, as we wished to prove.

In a similar manner one could prove that the successive derivatives $d\varphi/ds$, $d^2\varphi/ds^2$, ... remain small in this region.

Evidently one could likewise consider the other part of the invariant curve $PP_{-1}P_{-2}\ldots$.

We obtain thus the following partial result:

Let us define a region L*OM* in the interior of LOM, in such a way that OL* and OM* are any algebraic curves whose order of contact with OL and OM respectively is arbitrarily large but finite; then there will exist in this region L*OM* a family of invariant curves which follow the algebraic curves F = const. to any order in position and in direction, and even in their tangential derivatives of higher order up to an arbitrary order k (1).

It does not seem possible to follow the invariant curves $\ldots P_{-1}PP_1\ldots$ further into the regions M*OM and LOL* without making use of the analytic peculiarities of the hypercontinuous invariant curves and of the invariant function F*. Even in the general hyperbolic case such peculiarities proved necessary in the reasoning.

Let us start from the simplest case among the other hyperbolic cases, namely that in which F* begins with a polynomial of the third degree $F_3(u, v)$ which vanishes for three different real values of v/u. In this case there are three invariant curves with different tangents which pass through the fixed point O. We have to consider the neighborhood of one of these invariant curves. The region to be considered is the region between a certain curve OM* and this curve OM. In the neighborhood of OM* we know that this invariant family follows the algebraic family F = const. very closely in the manner which we have just specified. We can suppose that the curve OM is tangent to the u axis. The transformation T is then written

$u_1 = u + a u^2 + 2b\,uv + 3c\,v^2 + \cdots, \qquad (a > 0),$

$v_1 = v - 2a\,uv - b\,v^2 + \cdots$


(1) It is only the result for k = 1 of which we make use later.





with

F* (u, v) = * -f- au*V -f buv • 4 - cr* 4 * ... (nc — 46 * ^ 0 ). 

In this case the formal series for OM will have the form

$v = b_k u^k + b_{k+1} u^{k+1} + \cdots \qquad (k > 1),$

where the integer k is fixed but arbitrarily large. The precise result of my Memoir concerning the corresponding invariant curve is the following: There exists a hypercontinuous invariant curve ON represented by the two equations

(25)  $u = f(\tau) \sim \tau + a_2 \tau^2 + \cdots, \qquad v = g(\tau) \sim b_k \tau^k + \cdots$
for $\tau$ small and $\Re(\tau) > 0$ (1), in such a way that the function

cd = -i- 4- a log 14 - P* + • • • •+■ xt * *■ ft W 

is transformed precisely into $\omega - 1$ by T, that is to say $\omega_1 = \omega - 1$.

In other words I obtained incidentally a normal parameter $\tau$. Now this parameter proves to be of theoretical importance, as we are going to see.

Let us introduce the coordinates defined by (2)

(26)  $\bar u = f^{-1}(u), \qquad \bar v = -g\bigl(f^{-1}(u)\bigr) + v.$

These coordinates are evidently valid for u, v small and positive in the real plane of the variables u, v, and they make the hypercontinuous invariant curve OM correspond to the $\bar u$ axis. In these variables the invariant series $F^*(\bar u, \bar v)$ can be written in the form

(27)  $F^*(\bar u, \bar v) = \bar v\,\bigl[\varphi_0(\bar u) + \varphi_1(\bar u)\,\bar v + \cdots\bigr]$

      $\bigl(\varphi_0(\bar u) = \varphi_{02}\,\bar u^2 + \varphi_{03}\,\bar u^3 + \cdots, \quad \varphi_1(\bar u) = \varphi_{11}\,\bar u + \varphi_{12}\,\bar u^2 + \cdots\bigr).$


(1) The symbol $\Re(a)$ denotes the real part of a. The functions f and g here are analytic for $\Re(\tau) > 0$ and of class $C^{\infty}$ for $\tau = 0$. From the formal point of view all our transformations must correspond to formal series, in order that we may apply the formal theory.

(2) The notation $f^{-1}$ indicates the inverse function.




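The transformation T will have the following form in these variables,

(28)  $\bar u_1 = c(\bar u, \bar v) \sim c_0(\bar u) + c_1(\bar u)\,\bar v + \cdots,$
      $\bar v_1 = \bar v\, d(\bar u, \bar v) \sim \bar v\,\bigl(d_0(\bar u) + d_1(\bar u)\,\bar v + \cdots\bigr)$ (1),

where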

e «(“) = A-(A(!.)- ~1)~ ll + Ct ' ll * + •••’ e >M = e “ u + .. 

(^ 9 ) 

= + + . • 


Since F* is invariant, we shall have the formal identity

$\bar v_1\,\bigl(\varphi_0(\bar u_1) + \varphi_1(\bar u_1)\,\bar v_1 + \cdots\bigr) = \bar v\,\bigl(\varphi_0(\bar u) + \varphi_1(\bar u)\,\bar v + \cdots\bigr).$

Substituting the values of $\bar u_1$ and $\bar v_1$ above and comparing the coefficients of the powers of $\bar v$ on the two sides, one obtains an infinite number of equations, of which the first is the following:

(30)  $d_0(\bar u)\,\varphi_0\bigl(c_0(\bar u)\bigr) = \varphi_0(\bar u).$

This purely formal relation can be regarded as a functional equation for determining $\varphi_0(\bar u)$, since $d_0(\bar u)$ and $c_0(\bar u)$ represent ordinary functions; let us remark for example that from this fact alone we conclude that the constant $d_{00}$ in the series for $d_0(\bar u)$ must be equal to one. But when one knows a formal series solution of such a linear equation, one can always construct an actual solution from it by writing


00 —. 


<P.(“) = 9 0 (“) 




O) — 


■<.(“-■) V. (“) 




<*.(“-.) <P. 


(1) The symbol $\sim$ is employed here in the sense that $c(\bar u, \bar v)$ and $d(\bar u, \bar v)$ are functions of class $C^{\infty}$, analytic except for $\bar u = 0$, and such that the partial derivatives with respect to $\bar v$ have the values indicated along the invariant curve $\bar v = 0$.




where $\varphi_0^{(\mu)}(\bar u)$ is the polynomial in $\bar u$ which gives the first $\mu$ terms of the formal series for $\varphi_0(\bar u)$. Under these circumstances one has naturally

, — r -' + *(=-0 

do (“-*+•>) 

D’autre part la quantity ~u-* a \/w + k pour sa partie principle, et <P*(“) 
est hypercontinue et s'^vanouit pour u> = 0 . Done ce produit infini converge 
vers une fonction ? 0 (u) qu« sera analytique pour_« petit et .*(n).>0 , et qui 
possede & u = 0 la forme asymptotique de q> 0 (w)- En effet, on montre sans 
difficult^ qu'on obtient toujours la meme solution quel que soit lentier p, et 
cela nous montre que la fonction ainsi d^finie est asymptotiquement repr<<- 
sentde par la s^rie <?„(“)• 

Hence the coefficient $\varphi_0(\bar u)$ in F* represents a hypercontinuous function of $\bar u$, as we stated.

In the same manner one concludes successively that the coefficients $\varphi_1(\bar u), \varphi_2(\bar u), \ldots$ in the series for F* represent hypercontinuous functions.

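Consequently the function

(31)  $\bar F(\bar u, \bar v) = \bar v\,\bigl(\varphi_0(\bar u) + \cdots + \varphi_{\mu-1}(\bar u)\,\bar v^{\mu-1}\bigr)$

is almost invariant, that is to say

(32)  $\bar F(\bar u_1, \bar v_1) = \bar F(\bar u, \bar v) + \bar v^{\mu+1}\,G(\bar u, \bar v),$

where $G(\bar u, \bar v)$ is a function of the same type as $\bar F(\bar u, \bar v)$ (1).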

This is the fact which permits us to follow the curves $PP_1P_2\ldots$ into the

region M*OM.

In the first place let us observe that the curves $\bar F(\bar u, \bar v)$ = const. must follow the curves F(u, v) = const. obtained before, at least in the neighborhood of OM*, for example in $M^*OM_1^*$. Indeed the polynomial F(u, v) employed

in MOM* is almost invariant along the invariant curves and admits the same development as F*(u, v) at the origin up to an arbitrary order.


(1) More exactly, $G(\bar u, \bar v)$ is analytic in M*ON except at the origin, and admits hypercontinuous derivatives of every order with respect to $\bar v$ along $\bar v = 0$.




On changing the variables from u, v to $\bar u$, $\bar v$, F(u, v) becomes a function essentially of the same kind as $\bar F(\bar u, \bar v)$ and with the same asymptotic form in $M^*OM_1^*$ up to the corresponding order. Consequently the invariant curves will follow very nearly the curves $\bar F(\bar u, \bar v)$ = const. in $M^*OM_1^*$. One recalls the fact that the change of variables from u, v to $\bar u$, $\bar v$ is made by employing functions of class $C^{\infty}$ everywhere. Consequently the modified curves $PP_1\ldots$ will follow the transformed curves of $\bar F(\bar u, \bar v)$ = const. to any order in $M^*OM_1$.

But we can now employ almost the same reasoning as before. Let us observe in the first place that, according to (26), $u_k$ is approximately equal to $u/(1 - ku)$ along $OM$. This makes evident the fact that a point $P$ of $OM$ will pass beyond the boundary of a fixed neighborhood of the origin before $\gamma/u$ iterations of $T$ have been performed, $u$ designating the initial coordinate of the point and $\gamma$ being a constant. But the formulas (24) require that the same result subsists even when one starts from a point $P$ in $M^*OM_1$. Indeed in this region $v$ is very small in comparison with $u$. Under the iteration of $T$, $v$ decreases constantly according to equations (24), (25), while $u$ increases in the same manner as on the curve $OM$ itself. Hence $P$ will pass beyond the boundary of the fixed neighborhood after at most $\gamma/u$ iterations of $T$.
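The escape-time bound can be checked on a toy model. The sketch below assumes, purely for illustration, that the map along $OM$ is exactly $u \mapsto u/(1-u)$, whose $k$-th iterate is exactly $u/(1-ku)$; the actual transformation $T$ only agrees with this approximately. The function name `escape_iterations` and the radius `1/2` are hypothetical choices, not taken from the text.

```python
from fractions import Fraction

def escape_iterations(u0, radius):
    """Count iterations of the model map u -> u/(1 - u) (an assumed stand-in
    for T along OM) until u leaves the interval 0 < u < radius."""
    u = Fraction(u0)
    radius = Fraction(radius)
    count = 0
    while 0 < u < radius:
        u = u / (1 - u)  # exact rational arithmetic; radius <= 1 keeps 1 - u > 0
        count += 1
    return count

u0 = Fraction(1, 100)
n = escape_iterations(u0, Fraction(1, 2))
print(n, "iterations, bound gamma/u with gamma = 1:", 1 / u0)
```

In this model the count stays below $1/u$, which is the shape of the bound $\gamma/u$ used in the argument.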

Moreover equation (31) shows that the following inequality holds:

$|\bar F_1 - \bar F| \le K\,v^{p+1}$

and consequently that the value of $\bar F$ will change by at most $\gamma K v^{p+1}/u$ during such successive iterations; here $u$ and $v$ designate the values of these variables at the point $P$ in the region $M^*OM$. But the initial value of $\bar F$ is of the same order of magnitude as $u^2 v$. Hence $\bar F$ will remain approximately constant, since $\gamma K v^{p+1}/u$ is much smaller than $u^2 v$. More exactly, the ratio of these two quantities is of order at least $p - 3$ in $v$, since $v$ is small in comparison with $u$. Hence $\bar F$ will remain constant up to an order in $v$ assigned in advance.

If we now introduce the quantity

$\sin\varphi = \dfrac{\bar F_u\,u' + \bar F_v\,v'}{\sqrt{\bar F_u^2 + \bar F_v^2}\;\sqrt{u'^2 + v'^2}}$

as before and reason once more in the same manner, we can conclude that the angle $\varphi$ between the continuation of the invariant curves and the curves $\bar F = \text{const.}$ remains small up to the moment when the curve $P, P_1, \ldots$ leaves a given neighborhood of the origin in $M^*OM$.

Up to this point we have treated only the special case in which the formal invariant curves consist of three simple curves with three distinct tangent directions at the origin. Nevertheless the method employed is seen at once to be entirely general, with the exception of the case in which there is an analytic curve of fixed points containing the given fixed point; indeed there is always such a normal parameter along each hypercontinuous invariant curve (*).

It seems certain to me that even the excepted case of an analytic family of fixed points can be treated in a similar manner. However, in the present Memoir, already very long, I shall not undertake it.


5. Application to analytic continuation.

The results obtained in this chapter and the preceding chapter permit an application to the analytic continuation of periodic solutions. We limit ourselves here to dynamical systems with two degrees of freedom in their reduced form ($n = 3$) which contain a parameter $\lambda$ analytically. Moreover we suppose that there exist no analytic families of periodic solutions but only isolated periodic solutions for every value of $\lambda$, and we consider a corresponding surface of section (local or not).

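The fixed points of the corresponding transformation $T$ have coordinates $(u, v)$ which satisfy the equations $u_1 - u = 0$, $v_1 - v = 0$. These coordinates will be analytic functions of $\lambda$ provided that the functional determinant does not vanish:

$$\begin{vmatrix} \dfrac{\partial u_1}{\partial u} - 1 & \dfrac{\partial u_1}{\partial v} \\[2mm] \dfrac{\partial v_1}{\partial u} & \dfrac{\partial v_1}{\partial v} - 1 \end{vmatrix} \neq 0 .$$

But at the fixed point one always has

$$\frac{\partial u_1}{\partial u}\,\frac{\partial v_1}{\partial v} - \frac{\partial u_1}{\partial v}\,\frac{\partial v_1}{\partial u} = 1 ,$$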

(*) See S. T., Chap. 2.




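so that this condition reduces to

$$\frac{\partial u_1}{\partial u} + \frac{\partial v_1}{\partial v} - 2 \neq 0 .$$

On the other hand the characteristic equation is written

$$\rho^2 - \left(\frac{\partial u_1}{\partial u} + \frac{\partial v_1}{\partial v}\right)\rho + 1 = 0 .$$

Consequently an arbitrary periodic solution varies analytically with $\lambda$ without coinciding with any other periodic solution of this kind, at least until the moment when the roots $\rho$, $1/\rho$ of the characteristic equation both become equal to unity (points of coincidence).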
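The reduction of the determinant condition to a condition on the trace can be verified numerically. The sketch below takes a hypothetical Jacobian matrix with entries `a, b, c, d` (standing for the four partial derivatives of $u_1, v_1$) and determinant 1; the function name `fixed_point_data` and the sample matrix are assumptions made for illustration only.

```python
import cmath

def fixed_point_data(a, b, c, d):
    """For a 2x2 Jacobian [[a, b], [c, d]] return its determinant, its trace,
    det(J - I), and the two roots of rho^2 - trace*rho + 1 = 0."""
    det = a * d - b * c
    trace = a + d
    det_j_minus_i = (a - 1) * (d - 1) - b * c
    disc = cmath.sqrt(trace * trace - 4)
    rho_1 = (trace + disc) / 2
    rho_2 = (trace - disc) / 2
    return det, trace, det_j_minus_i, rho_1, rho_2

# Hypothetical area-preserving example: det = 2*1 - 1*1 = 1.
det, trace, det_j_minus_i, rho_1, rho_2 = fixed_point_data(2.0, 1.0, 1.0, 1.0)
print(det, trace, det_j_minus_i, rho_1, rho_2)
```

When the determinant of the Jacobian is 1, `det_j_minus_i` equals `2 - trace`, so the functional determinant vanishes exactly when the trace equals 2, that is, when both roots equal unity.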

Let us consider briefly one of the analytic branches of a periodic solution between two points of coincidence; evidently the branches cannot disappear when $\lambda$ varies in this manner. During this variation the index cannot change.

Let us classify the various types of such a partial branch. We observe that along the branch $\rho$ and $1/\rho$ either vary analytically with $\lambda$ or remain constant.

Suppose in the first place that $\rho$ and $1/\rho$ vary with $\lambda$. For $\rho$ and $1/\rho$ real and positive (but $\neq 1$) the index is always $-1$ (general hyperbolic case); for $\rho$ negative or imaginary the index is always $+1$. Hence when $\lambda$ varies between two points of coincidence, either the periodic solution remains of general hyperbolic type, or it is alternately of general elliptic type and of general hyperbolic type with exchange of branches, with $\rho = -1$ at the points of separation.
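The index rule just stated can be expressed in terms of the trace $\rho + 1/\rho$ of the Jacobian at the fixed point. The sketch below uses the standard fact that the index of a nondegenerate fixed point of a planar map is the sign of $\det(I - J)$, which equals $2 - (\rho + 1/\rho)$ when the determinant is 1; the function name `classify` and the wording of the labels are illustrative choices.

```python
def classify(trace):
    """Return (index, description) for a fixed point of an area-preserving
    planar map whose Jacobian has the given trace rho + 1/rho."""
    if trace == 2:
        raise ValueError("rho = 1: point of coincidence, index not determined here")
    if trace > 2:
        return -1, "hyperbolic, rho real and positive"
    if trace < -2:
        return +1, "hyperbolic with exchange of branches, rho real and negative"
    if trace == -2:
        return +1, "separation point, rho = -1"
    return +1, "elliptic, rho imaginary of modulus 1"

for t in (3.0, 0.5, -2.0, -3.0):
    print(t, classify(t))
```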

On the other hand if $\rho$ and $1/\rho$ are constants other than $+1$, the branch can never have points of coincidence with other branches. In this case we shall have the following possibilities: (1) If $\rho$ and $1/\rho$ are real but not equal to $-1$, the periodic solution remains everywhere of general hyperbolic type, with exchange of branches if $\rho$ and $1/\rho$ are negative. (2) If $\rho$ and $1/\rho$ are imaginary with $\rho^k \neq 1$, $k = 2, 3, \ldots$, the periodic solution remains everywhere of general elliptic type. (3) If $\rho$ and $1/\rho$ are such that $\rho^k = 1$, the periodic solution may be alternately of elliptic type or of hyperbolic type, with exchange of the sets of $k$ branches among themselves.

Finally let us remark that on a branch for which $\rho = 1$ identically, the index will be $+1$ if the periodic solution is of elliptic type and $-(\alpha - 1)$ if it is of hyperbolic type with $\alpha$ branches. Hence between the points of coincidence such a periodic solution remains of the same type.




With these preliminaries we can consider the points of coincidence $\rho = 1$ of two or more branches, supposing (for example) that $\lambda$ tends to $\lambda_0$ from above.

Let $P_1, \ldots, P_k$ be the $k$ hyperbolic points of $T$ which tend toward the point $P_0$, possessing $\alpha_1, \ldots, \alpha_k$ fixed asymptotic branches respectively. Let $Q_1, \ldots, Q_m$ be the $m$ elliptic points of $T$ which tend toward the same point. The indices of the $P_i$ and $Q_j$ are then $-(\alpha_i - 1)$ and $+1$ respectively, as our results above show.

If now all these points coincide at $P_0$ for $\lambda = \lambda_0$, the total index of this point of coincidence will necessarily be

$$-\sum_{i=1}^{k} \alpha_i + k + m ,$$

whence the following result:

If $k$ hyperbolic solutions with $\alpha_1, \ldots, \alpha_k$ asymptotic branches individually invariant under $T$, and $m$ elliptic solutions, coincide when the parameter $\lambda$ approaches $\lambda_0$, this periodic solution will correspond to a hyperbolic fixed point with $-\beta + 1$ asymptotic branches individually invariant, with $\rho = 1$, if the integer

$$\beta = -\sum_{i=1}^{k} \alpha_i + k + m$$

is negative or 0. Otherwise $\beta$ must be equal to $+1$ and the corresponding fixed point is of elliptic type. These are the only possible cases of coincidence, provided that the case of an analytic curve of fixed points is excluded.
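The index bookkeeping in this result is a short computation. The sketch below adds the indices $-(\alpha_i - 1)$ of the hyperbolic points and $+1$ of the elliptic points and reads off the type of the coincident fixed point; the function name `coincidence_type` and the sample inputs are hypothetical.

```python
def coincidence_type(alphas, m):
    """alphas: numbers of invariant asymptotic branches of the k hyperbolic
    points; m: number of elliptic points. Returns (type, number of branches)."""
    k = len(alphas)
    beta = -sum(alphas) + k + m  # total index of the coincident point
    if beta <= 0:
        return "hyperbolic", 1 - beta
    if beta == 1:
        return "elliptic", None
    # According to the stated result this case cannot occur.
    raise ValueError("total index greater than +1 is excluded by the result")

print(coincidence_type([2, 2], 1))  # two hyperbolic points and one elliptic point
print(coincidence_type([2], 2))     # one hyperbolic point and two elliptic points
```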


6. Some general remarks.

It is quite natural to suppose that the effect of a conservative transformation $T$ in the immediate neighborhood of a hyperbolic fixed point closely resembles that of a continuous group in two dimensions whose infinitesimal transformation is of the same type. More precisely, one might believe that on replacing the formal invariant series $F^*$ by any polynomial $F$ which coincides with $F^*$ up to terms of sufficiently high degree, and on defining an actual transformation $\bar T$ by means of the equations

$$\frac{du}{dt} = \frac{1}{Q}\,\frac{\partial F}{\partial v}, \qquad \frac{dv}{dt} = -\frac{1}{Q}\,\frac{\partial F}{\partial u},$$

where $Q$ is the function which enters in the invariant integral $\iint Q\,du\,dv$, the transformation $\bar T$ proves to be equivalent to $T$, at least if one admits transformations of class $C^k$, where the integer $k$ is arbitrarily large. I have not yet been able to prove this conjecture.
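A flow of this form leaves $F$ invariant, which can be checked numerically. The sketch below integrates $du/dt = F_v/Q$, $dv/dt = -F_u/Q$ by a fourth-order Runge-Kutta scheme for the assumed sample choice $F = uv$, $Q = 1$, for which the time-one map is $u_1 = e\,u$, $v_1 = v/e$. The names `flow_step` and `time_one_map`, the step count, and the choice of $F$ and $Q$ are illustrative assumptions, not taken from the Memoir.

```python
import math

def flow_step(u, v, h, grad_F, Q):
    """One RK4 step of du/dt = F_v / Q, dv/dt = -F_u / Q."""
    def rhs(x, y):
        F_u, F_v = grad_F(x, y)
        q = Q(x, y)
        return F_v / q, -F_u / q
    k1 = rhs(u, v)
    k2 = rhs(u + 0.5 * h * k1[0], v + 0.5 * h * k1[1])
    k3 = rhs(u + 0.5 * h * k2[0], v + 0.5 * h * k2[1])
    k4 = rhs(u + h * k3[0], v + h * k3[1])
    u_new = u + h * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6
    v_new = v + h * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6
    return u_new, v_new

def time_one_map(u, v, grad_F, Q, steps=1000):
    """Approximate the time-one map of the flow by `steps` RK4 steps."""
    h = 1.0 / steps
    for _ in range(steps):
        u, v = flow_step(u, v, h, grad_F, Q)
    return u, v

# Assumed sample data: F = u*v (hyperbolic fixed point at the origin), Q = 1.
F = lambda u, v: u * v
grad_F = lambda u, v: (v, u)
Q = lambda u, v: 1.0

u1, v1 = time_one_map(0.1, 0.2, grad_F, Q)
print(u1, v1, F(u1, v1) - F(0.1, 0.2))
```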

It is true that the results obtained above allow us to construct functions which are actually invariant, but the study of their properties will demand new methods. One must seek to construct such an invariant function of class $C^k$ ($k$ arbitrary), if such a function exists.

In the Pfaffian case of $n = 2m > 2$ dimensions, it seems almost certain that there exists a hypercontinuous configuration corresponding to each collection of formal series which gives an invariant asymptotic manifold in the formal sense. In fact, the methods of my Memoir cited above appear to extend to this case. But the local study of a fixed point, such as we have made here for the case of two dimensions, seems to present considerable difficulties.

If, more generally still, one starts from a non-Pfaffian differential system of any order $n$, one obtains immediately a corresponding transformation $T$ in $n - 1$ dimensions, as we have seen, and even an associated formal differential system. But the analytic means which we have employed do not seem to suffice to establish an equally close relation between the transformation $T$ and the solutions of this formal system.


CHAPTER IV.

The extended neighborhood of h-periodic solutions.

1. Some periodic solutions in the extended neighborhood.

For a long time I have believed that there exists a profound analogy between the e-periodic solutions (elliptic, unstable) and the h-periodic solutions (hyperbolic). It is by introducing the idea of the "extended neighborhood" of an h-periodic solution, or of the corresponding fixed point, that I now bring out this analogy more clearly.

The extended neighborhood of a hyperbolic fixed point is merely the ordinary neighborhood of two adjacent $\alpha$ and $\omega$ asymptotic branches, namely $P\alpha Q$ and $Q\omega P$, which meet at the fixed point $P$ and at a homoclinic point $Q$, without these branches coinciding from the analytic point of view (see the accompanying figure). We shall always speak of the point $P$ as a hyperbolic point of general type. But the results of the preceding chapter show us that the point $P$ may be any isolated hyperbolic fixed point whatever. Moreover we shall also speak of the case in which the angle between the two branches at the point $Q$ does not vanish. But if the branches $P\alpha Q$ and $Q\omega P$ are tangent at the point $Q$, the same reasoning remains essentially valid. Indeed, the two branches will have a contact of finite order at this point; this fact and the results of the last chapter concerning the families of invariant curves in the ordinary neighborhood of $P$ allow us to say that our conclusions subsist, whatever the order of contact may be.

It is not necessary to have a regular surface of section (Chapter I, section 3) in order to obtain such a neighborhood. Indeed, if the $\alpha$ and $\omega$ asymptotic surfaces respectively of an h-periodic solution intersect in a homoclinic solution, without these surfaces being identical, there exists such an extended neighborhood. To see this, suppose that a curve traced by a point in the space of trajectories joins the point $P$ of the h-periodic trajectory to the homoclinic trajectory, remaining entirely in the surface of $\alpha$ asymptotic trajectories. A finite part of this $\alpha$ asymptotic surface is homeomorphic to a plane, and the family of individual trajectories is homeomorphic to a family of parallel lines in this plane. Hence it is evident that this curve $P\alpha Q$ can be chosen so that it cuts the trajectories in the same sense all along $P\alpha Q$. In the same manner one obtains an analogous curve $P\omega Q$. In this procedure we disregard every intersection of the two $\alpha$ and $\omega$ asymptotic surfaces, except at the point $Q$. In this manner one obtains a curve $P\alpha Q\omega P$ along which a local surface of section can be constructed, and thus the configuration of the figure.

We shall see later that, except in very restricted cases, such homoclinic solutions must always exist. It was Poincaré who first demonstrated the existence of such homoclinic solutions in the restricted problem of three bodies for small values of one of the two finite masses (*). In fact, his reasoning showed why homoclinic solutions always exist in a non-integrable dynamical problem which differs very little from an integrable problem.

Let us now construct a family of invariant curves between the two asymptotic branches. These fill in the first place a small neighborhood $ABCDP$ of the fixed point (see figure 19). But on continuing these curves, one obtains a ribbon of these invariant curves which cross one another in the small quadrilateral $QRST$ in the manner indicated.

If one recalls the invariant hyperbolas in the case of a transformation $u_1 = \rho u$, $v_1 = v/\rho$ ($\rho > 1$) for example, with $P\alpha Q$ corresponding to the axis of $u$ and $P\omega Q$ to the axis of $v$, one sees that the arcs $AB$, $BC$ and $CD$ correspond respectively to the equations $u = \text{const.}$, $uv = \text{const.}$, and $v = \text{const.}$ According to the results of the preceding chapter these invariant curves are very regular. In particular the direction of the tangent line varies continuously, even when the point tends toward a point of one of the asymptotic branches, the fixed point excepted. We shall attach to these invariant curves a constant $c$ which reduces to zero on the asymptotic branches.

To understand the situation well, it is advantageous to fix attention exclusively on this ribbon of surface, neglecting the intermediate homoclinic points which exist if $P\alpha Q$ and $P\omega Q$ intersect in other points of these arcs, which is quite possible.

Hence for every value of $c > 0$ one obtains a small quadrilateral $QRST$ formed in the manner indicated in our figure. For still smaller values of $c$ one obtains analogous quadrilaterals $QR'S'T'$. All the points $S'$ lie on a diagonal curve $QS$.

These facts are evident from the geometric point of view, and for this reason we shall not stop to establish them in detail.

I formerly made (*) the following observation on the geometry of this figure, without realizing the import of the geometric configuration and without proving the properties of the invariant curves, established in the preceding chapter, which were implied: there exists an infinite number of periodic points on the diagonal curve $QS$. Indeed, the arc $SCBS$ of the invariant curve contains a number $k$ of points $S, T(S), \ldots$. On letting $S'$ vary from $S$ toward $Q$ along $SQ$, we shall always have an invariant curve which cuts itself at $S'$, and at the same time the number $k'$ must increase toward infinity. Hence we obtain a corresponding series of points $S_k, S_{k+1}, \ldots$ such that $T^l(S_l) = S_l$ ($l \ge k$).

(*) Les méthodes nouvelles de la Mécanique céleste, vol. 3.
(*) See the Note already cited.
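The growth of the number of iterations as the invariant curve approaches the asymptotic branches can be seen in the linear model $u_1 = \rho u$, $v_1 = v/\rho$ recalled earlier. The sketch below counts how many iterations a point on the hyperbola $uv = c$ needs to pass from $v = a$ to $u \ge a$; the function name `transit_count` and the values $a = 1$, $\rho = 2$ are hypothetical, and the linear model is only a stand-in for the actual transformation near $P$.

```python
def transit_count(c, a=1.0, rho=2.0):
    """Iterations of (u, v) -> (rho*u, v/rho) needed to carry the point
    (c/a, a) of the hyperbola u*v = c to the region u >= a."""
    u, v = c / a, a
    n = 0
    while u < a:
        u, v = rho * u, v / rho  # the product u*v is unchanged by each step
        n += 1
    return n

for c in (1e-2, 1e-4, 1e-6):
    print(c, transit_count(c))
```

The count grows like $\log(a^2/c)/\log\rho$, so it increases without bound as $c$ tends to zero, in agreement with the statement that the number of points on the arc must increase toward infinity.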

Consequently there exist periodic solutions situated in the extended neighborhood of a given h-periodic solution, whose trajectories circulate $k$ times, $k + 1$ times, ... respectively around the extended neighborhood of this solution while approaching the corresponding homoclinic solution only once. More exactly, there exist fixed points of the transformations $T^k, T^{k+1}, \ldots$ of which a single image lies in the quadrilateral $QRST$, and such that the successive points make the circuit of the annular region of the figure in $k$ iterations, in $k + 1$ iterations, ... respectively.

This property of h-periodic solutions closely resembles that of the non-degenerate e-periodic solutions, which also possess neighboring periodic solutions in infinite number. The only difference is that in the elliptic case these neighboring solutions are situated in the ordinary neighborhood of the given periodic solution, while in the hyperbolic case they are situated in the extended neighborhood, that is to say they alternately approach and recede from the given periodic solution, while remaining near the two parts of the $\alpha$ and $\omega$ asymptotic surfaces which are terminated by the homoclinic solution.

We shall see (sections 3, 4, 5) how one can find other periodic solutions in the extended neighborhood of such an h-periodic solution, as well as a whole hierarchy of other neighboring solutions.


2. A lemma.

But before pushing further the study of the neighboring solutions, let us point out the lemma which seems to us to play, in the hyperbolic case, a role completely analogous to that of the lemma of section 6, chapter II, in the elliptic case. When one recalls that the lemma stated earlier must be considered the simplest prototype of the theorems analogous to Poincaré's last geometric theorem, one understands the interest of the following analogous lemma:

Let $R$ be an infinite plane ribbon entirely filled by a regular family of curves to which the two edges of the ribbon belong. Suppose moreover that the ribbon crosses itself in a quadrilateral region $QRST$ (see the accompanying figure) in such a way that each line of the family cuts itself only once in $QRST$. Let $T$ be a one-to-one, continuous and direct transformation of the ribbon into itself which leaves invariant every curve of the given family of curves and which advances the two edges in the two opposite senses. Then there will exist a fixed point $P$ in the interior of $QRST$.


Fig. 20. 



Evidently it is this lemma which effectively enters in the preceding section, and it is obvious at a glance that the lemma is true.
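One way to read why the lemma is evident is as an intermediate-value argument: label the curves of the family by a parameter $c$, and let $d(c)$ be the signed displacement along its own curve of the self-crossing point of the curve $c$ under $T$. The two edges advance in opposite senses, so $d$ has opposite signs at the two ends, and a zero of $d$ gives a fixed point. The sketch below is only an illustration of that reading; the function `displacement` is a hypothetical stand-in, not derived from any actual transformation.

```python
def find_fixed_curve(displacement, c_lo, c_hi, tol=1e-12):
    """Bisection for a zero of a continuous signed displacement d(c) that
    has opposite signs at the two edges c_lo and c_hi of the ribbon."""
    d_lo, d_hi = displacement(c_lo), displacement(c_hi)
    if d_lo == 0:
        return c_lo
    if d_hi == 0:
        return c_hi
    if d_lo * d_hi > 0:
        raise ValueError("the two edges must advance in opposite senses")
    while c_hi - c_lo > tol:
        c_mid = 0.5 * (c_lo + c_hi)
        d_mid = displacement(c_mid)
        if d_mid == 0:
            return c_mid
        if d_lo * d_mid < 0:
            c_hi = c_mid
        else:
            c_lo, d_lo = c_mid, d_mid
    return 0.5 * (c_lo + c_hi)

# Hypothetical displacement: negative on one edge, positive on the other.
displacement = lambda c: c - 0.3
print(find_fixed_curve(displacement, 0.0, 1.0))
```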

It is very interesting to remark that in the present case we have no need to suppose that the transformation $T$ is conservative. Consequently this lemma, in appearance almost trivial, applies to systems of differential equations of the third order even if the system is not of dynamical origin. Thus it allows us to deduce the existence of neighboring periodic solutions in very general cases.


3. Study of the extended neighborhood of an h-periodic solution.

Let us now study systematically the extended neighborhood of an h-periodic solution.

Let us consider then the successive images of the quadrilateral $QRST$ of figure 19 above under the transformations $T, T^2, \ldots$. Here again the essential facts are obvious at a glance. The successive images up to the $k$-th lie in the interior of the annular region, with the sides $Q_1R_1, Q_2R_2, \ldots$ along $Q\omega P$. From this moment on the images leave the extended neighborhood, traversing the quadrilateral one or more times in successive regions which we number $k + 1, k + 2, \ldots$ (see the figure below); the regions may be made up of one or of several simply connected regions which extend either to $TS$ or to $QR$, but they will contain at least one part which extends from $TS$ to $QR$.

The successive regions $R_{k+1}, R_{k+2}, \ldots$ so defined are arranged one below the other according to their successive indices.

Evidently all the points of the images of the quadrilateral $QRST$ up to the $k$-th are situated in the annular region outside of $QRST$ itself.




The points of the $(k+1)$-th region are precisely those which return to $QRST$ for the first time after $k + 1$ iterations of $T^{-1}$. In an analogous manner the points of $R_{k+2}$ are precisely those which return to $QRST$ after $k + 2$ iterations; and so on. Hence the entirely distinct regions $R_{k+1}, R_{k+2}, \ldots$ must contain all the points of $QRST$ which can remain in the annular region under the indefinite iteration of $T^{-1}$.

It is evident also that $R_n$ tends toward $TQ$ when $n$ increases indefinitely.

Now we begin again in the following manner. Instead of considering $QRST$, we consider only the regions $R_k, R_{k+1}, \ldots$ in the interior of $QRST$. These regions will remain in the annular region under the transformations $T, T^2, \ldots$ as before. But the images of these regions by $T^k$ must traverse $R_k$ in the manner indicated in the figure. We shall thus have regions $R_{k,k}, R_{k,k+1}, \ldots, R_{k,l}, \ldots$, all in $R_k$, each of which extends either to $TS$ or to $QR$ and at least one of which extends from $TS$ to $QR$. Similarly we define the regions $R_{k+1,k}, R_{k+1,k+1}, \ldots$ in $R_{k+1}$, and so on.

Evidently the regions $R_{m,l}$ so obtained are composed precisely of the points of $QRST$ which do not leave the annular region before circulating at least twice in the negative sense around the annular region; more exactly, any point $Q$ of $R_{m,l}$ returns to $QRST$ for the first time after $m$ iterations of $T^{-1}$, and for the second time after $l$ further iterations of $T^{-1}$.

In the same manner we obtain regions $R_{m,l,p}$ whose points circulate at least three times in the negative sense around the same region, the first time after $m$ iterations of $T^{-1}$, the second time after $l$ iterations, and the third time after $p$ iterations.




Hence one obtains an indefinite series of regions $R_{m,l,\ldots,s}$, each of which is composed of simply connected regions which extend either to $TS$ or to $QR$, and at least one of which extends from $TS$ to $QR$. For these regions we shall always have

$R_{m,l,\ldots,s,t} < R_{m,l,\ldots,s}$

for each value of $m, l, p, \ldots$; here the relation $A < B$ indicates that $A$ is included in the interior of $B$.

After this analysis it is evident that the points of $QRST$ which remain in the annular region when $T^{-1}$ is indefinitely repeated are precisely those which are contained in an infinite sequence of regions:

$R_m, \; R_{m,l}, \; R_{m,l,p}, \; \ldots$

Such a point lies in $QRST$ after $m$ iterations, after $l$ further iterations, and so on. But our figure above makes visible the fact that the set of points of this kind for a given sequence $m, l, p, \ldots$ is the sum of a countable infinity of "branches", each in a single piece, which extend to $TS$ or to $QR$ and at least one of which extends from $TS$ to $QR$. We shall designate the set of points of this kind by the symbol $[m, l, p, \ldots]$.

The different sets $[m, l, p, \ldots]$ have no points in common and are ordered in $QRST$ in the following manner: if one considers two sets $[m, l, p, \ldots]$, $[m', l', p', \ldots]$ such that the first non-vanishing difference

$m' - m, \quad l - l', \quad p' - p, \quad \ldots$

is negative, the set $[m, l, \ldots]$ lies below $[m', l', \ldots]$, that is to say farther from $TQ$.

Indeed it appears from the same figure that the arc $TQ$ of $P\omega Q$ is "below" $SR$; hence, on regarding the region $R_l$ in its part farthest from $TQ$ which extends from $TS$ to $QR$, that $R_l$ will have for its upper side an arc of the prolongation of the branch $P\alpha Q$. In general if we consider

$R_{m,l,\ldots,s}$ and $R_{m,l',\ldots,s'}$

it is evident that if $R_{m,l,\ldots,s}$ is below $R_{m,l',\ldots,s'}$, then $R_{l,\ldots,s}$ is above $R_{l',\ldots,s'}$. In other words, if the set $[m, l, \ldots, s]$ is below $[m, l', \ldots, s']$, then $[l, \ldots, s]$ is above $[l', \ldots, s']$. But $R_{m'}$ is below $R_m$ if $m' > m$. Hence the rule is justified.
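The ordering rule for the symbols is an alternating lexicographic comparison, which can be sketched as follows. The function name `lies_below` is hypothetical, and finite tuples stand in for the initial segments of the infinite symbols; the sign convention is the one stated in the rule above (differences $m' - m$, $l - l'$, $p' - p$, ...).

```python
def lies_below(sym, sym_prime):
    """Return True if [m, l, p, ...] lies below [m', l', p', ...] according to
    the rule: the first non-vanishing difference among
    m' - m, l - l', p' - p, ... is negative.
    Returns None if the compared initial segments agree."""
    for position, (a, a_prime) in enumerate(zip(sym, sym_prime)):
        # The sign of the difference alternates with the position.
        diff = (a_prime - a) if position % 2 == 0 else (a - a_prime)
        if diff != 0:
            return diff < 0
    return None

print(lies_below((5, 3), (4, 3)))  # m' - m = -1 is negative
print(lies_below((4, 3), (4, 5)))  # m' - m = 0, then l - l' = -2 is negative
print(lies_below((4, 5), (4, 3)))  # l - l' = 2 is positive
```

The alternation of signs reflects the reversal of order at each further level that the justification describes.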



From our point of view the "essential" parts of $R_{m,l,\ldots,s}$ and of $[m, l, \ldots]$ are formed respectively by the region of $R_{m,l,\ldots,s}$ and by the connected subset of $[m, l, \ldots]$, extending from $TS$ to $QR$, which is the least distant from $TQ$.

Evidently these various sets $[m, l, \ldots]$ are ordered in exactly the same way as the points of a set on a line.

We summarize our results in the following manner:

The points of such a quadrilateral $QRST$ which never pass outside of the extended neighborhood, on repeating indefinitely the transformation $T^{-1}$, are precisely the points of the closed sets $[m, l, p, \ldots]$. Each set is composed of branches in a single piece which extend either to $TS$ or to $QR$, and at least one of which extends from $TS$ to $QR$. These sets $[m, l, p, \ldots]$ have no points in common and are ordered one above the other in the same order as the successive indices. A particular set $[m, l, p, \ldots]$ corresponds precisely to the points which return to $QRST$ after one circuit of the extended neighborhood in $m$ successive iterations of $T^{-1}$, which return again after another circuit in $l$ iterations, and so on.

We can now prove the existence of other types of periodic points:

Only the sets $[m, l, p, \ldots]$ with a periodic sequence can have positive measure. Such a set always contains at least one periodic point which returns to $QRST$ after $m$ iterations of $T^{-1}$, after $l$ further iterations, and so on, while returning to its initial position when the period of the integers $m, l, \ldots$ has been run through.

To prove these facts we note the relation

$T^{-m}\bigl([m, l, p, \ldots]\bigr) \subset [l, p, \ldots] .$

Indeed, after $m$ repetitions of $T^{-1}$ every point of $[m, l, p, \ldots]$ will be transformed into a point of $[l, p, \ldots]$. But the measure in the sense of $\iint Q\,du\,dv$ is invariant under the conservative transformation $T$. If the sequence $m, l, p, q, \ldots$ did not terminate in a purely periodic sequence, we should find in this way an infinite sequence of sets without common points,

$[m, l, p, \ldots], \quad [l, p, q, \ldots], \quad [p, q, \ldots], \quad \ldots,$

of increasing or at least non-decreasing measures, all contained in the interior of $QRST$. This is possible only if the measure of each set is zero.
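The shift on symbols used in this argument can be illustrated for sequences that do terminate in a periodic part, represented by a finite prefix and a repeating cycle. The sketch below lists the successive shifted symbols; the name `shifts` and the (prefix, cycle) representation are hypothetical, and the representation is assumed to be given in reduced form. For such a symbol only finitely many distinct shifts occur, which is why the argument above needs a sequence that does not terminate in a periodic one in order to produce infinitely many disjoint sets.

```python
def shifts(prefix, cycle, count):
    """Return the first `count` successive shifts of the symbol
    [prefix..., cycle, cycle, ...], each as a (prefix, cycle) pair.
    Dropping the leading integer corresponds to that many iterations of T^{-1}."""
    prefix, cycle = list(prefix), list(cycle)
    result = []
    for _ in range(count):
        result.append((tuple(prefix), tuple(cycle)))
        if prefix:
            prefix = prefix[1:]             # drop the leading integer of the prefix
        else:
            cycle = cycle[1:] + cycle[:1]   # rotate the periodic part
    return result

for symbol in shifts([7], [3, 4], 4):
    print(symbol)
```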

In the purely periodic case the index is of the form

$m, l, p, \ldots, s, m, l, p, \ldots$

The transformed set $T^{\,j(m+l+\cdots+s)}[m, l, \ldots]$ for every $j < 0$ must then form part of $[m, l, p, \ldots]$ itself, and at the same time possess the same measure. Hence $[m, l, \ldots]$ will consist of a closed set $S$, common to all these transformed sets, and of a set $S'$ of measure zero. All the points of the first set will remain in the interior of $QRST$ under the repeated iteration of $T^{\,m+l+\cdots+s}$ or of its inverse.

Suppose now that the index is not purely periodic, but terminates in a periodic sequence. The corresponding set $[\alpha, \beta, \ldots, \kappa, m, l, \ldots, m, l, \ldots]$ will be transformed by $T^{-(\alpha+\beta+\cdots+\kappa)}$ into a subset of the periodic set $[m, l, \ldots]$. But this subset has no part in common with the set $S$ defined above. Hence the transformed set is a part of $S'$ and must also be of measure zero.

It remains for us to prove that each set with a purely periodic sequence contains at least one periodic point with the property indicated. Let us remark that any part of $R_{m,\ldots,s}$ which extends from $TS$ to $QR$ will be transformed by $T^{-(m+\cdots+s)}$ into a region $R'_{m,\ldots,s}$ which extends at least once from $TQ$ to $SR$ in $R_{m,\ldots,s}$. On repeating this transformation one obtains an infinite sequence of regions $R'_{m,\ldots,s}, R''_{m,\ldots,s}, \ldots$ which contains a closed connected set $R^{\infty}_{m,\ldots,s}$ (a subset of the periodic set $[m, l, \ldots]$) which extends from $TS$ to $QR$ and for which

$T^{-(m+\cdots+s)}\bigl(R^{\infty}_{m,\ldots,s}\bigr) \subset R^{\infty}_{m,\ldots,s} .$

In other words the closed connected set $R^{\infty}_{m,\ldots,s}$ is transformed into a part of itself by the transformation $T^{-(m+\cdots+s)}$. One deduces therefore from a well-known theorem of Brouwer that there exists at least one point invariant under $T^{-(m+\cdots+s)}$ in this set, as we have stated.

Let us pass now to the consideration of the solutions which remain in the extended neighborhood when the transformation $T$ is repeated indefinitely. Evidently these are the points of the sets $[m', l', p', \ldots]'$ entirely analogous to the sets defined above. Let us observe that these sets extend either to $SR$ or to $TQ$ and that at least one of them extends from $SR$ to $TQ$ (see the accompanying figure).

Fig. 22.

But each pair of branches of $[m, l, \ldots]$ and of $[m', l', \ldots]'$ which extend from the sides $SR$ and $TS$ of the quadrilateral $QRST$ to $TQ$ and $QR$ respectively must intersect at least once.

It follows that there exists at least one point $P$ of $QRST$ which never leaves the annular region, which lies in $QRST$ after $m$ iterations of $T^{-1}$, after $l$ further iterations, ..., and at the same time lies in $QRST$ after $m'$ iterations of the inverse transformation $T$, after $l'$ further iterations, .... Here $m, l, p, \ldots, m', l', p', \ldots$ are arbitrary integers $\ge k$.

Let us designate the closed set of points $P$ of this kind by the doubly infinite symbol

$[\ldots, p, l, m, m', l', p', \ldots] .$

This set of points is merely the set of points which belong to $[m, l, p, \ldots]$ and to $[m', l', p', \ldots]'$ at the same time. Such arithmetic symbols are well adapted to the discussion of all the points which remain always in the extended neighborhood of the fixed point; they somewhat resemble the symbols effectively introduced by Hadamard in his remarkable study of the geodesics on certain open surfaces of negative total curvature (loc. cit.).

It appears then that every non-integrable dynamical system which admits a single homoclinic solution of this kind must admit an almost inconceivable hierarchy of solutions in the corresponding extended neighborhood. Among these solutions let us point out the following:


(1) The periodic solutions attached to arbitrary periodic symbols.

(2) The solutions asymptotic to such periodic solutions as the time increases (or decreases), whose infinite symbol terminates (or begins) with a corresponding arbitrary periodic sequence.

(3) The recurrent solutions whose symbol is recurrent, that is to say such that each sequence of $n$ integers ($n$ arbitrary) in the symbol is found at least once in every sequence of $N$ integers of the symbol.

(4) The solutions asymptotic to such recurrent solutions, whose symbol terminates or begins with a recurrent sequence.

(5) The solutions asymptotic to two given neighboring periodic or recurrent solutions in the two senses of the time; here the two extreme parts of the symbol are given in advance, while the intermediate part is not.

(6) The special solutions of our set whose symbols do not contain all possible sequences of integers.

(7) The general solutions of our set whose symbols contain all possible sequences of integers.
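The recurrence condition in item (3) can be tested on a finite stretch of a symbol. The sketch below checks, for a given block length `n` and window length `N`, whether every block of `n` consecutive integers occurring in the word occurs in every window of `N` consecutive integers; the names `blocks` and `recurrent_on_window` and the sample periodic word are hypothetical, and a finite word can only illustrate the definition, not establish it for an infinite symbol.

```python
def blocks(word, n):
    """Set of all blocks of n consecutive integers occurring in `word`."""
    return {tuple(word[i:i + n]) for i in range(len(word) - n + 1)}

def recurrent_on_window(word, n, N):
    """True if every block of length n in `word` appears in every window
    of N consecutive integers of `word`."""
    wanted = blocks(word, n)
    return all(
        wanted <= blocks(word[i:i + N], n)
        for i in range(len(word) - N + 1)
    )

word = [3, 4, 5] * 10  # a stretch of the periodic symbol 3, 4, 5, 3, 4, 5, ...
print(recurrent_on_window(word, n=2, N=4))
print(recurrent_on_window(word, n=2, N=3))
```

For a periodic symbol of period $q$, a window of length $N = q + n - 1$ suffices, which is why periodic symbols are a special case of recurrent ones.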

Let us remark in conclusion that our study shows the necessary existence of an infinite number of homoclinic solutions (*). Consequently we can obtain many other neighboring solutions by starting from other homoclinic solutions which belong to the same periodic solution.


4. The extended neighborhood of a cycle of h-periodic solutions.

We wish to show now how the homoclinic solutions can be replaced by a cycle of heteroclinic solutions.

Let us begin with the simplest case of two h-periodic solutions which correspond to the invariant points $P_1$ and $P_2$ of the surface $S$. Let $Q_{12}$ and $Q_{21}$ be two heteroclinic points, such that $Q_{12}$ lies on the $\alpha$ ($\omega$) branch issuing from $P_1$ ($P_2$), while $Q_{21}$ lies on the $\omega$ ($\alpha$) branch issuing from $P_1$ ($P_2$) (see the figure). Let us construct an extended neighborhood of the two motions as indicated in the figure.



Starting from the quadrilateral $Q_{12}R_{12}S_{12}T_{12}$ and considering its successive images under $T, T^2, \ldots$, one arrives at an image which traverses $Q_{21}R_{21}S_{21}T_{21}$ from $T_{21}S_{21}$ to $Q_{21}R_{21}$, and such that all the later images traverse it also.


(*) Poincaré noted this fact in the case of the three-body problem. Les méthodes nouvelles de la Mécanique céleste, vol. 3.



But by considering such a part of Q_21 R_21 S_21 T_21 one sees in the same manner that its (m_2 + 1)-th image, as well as all its later images, must traverse Q_12 R_12 S_12 T_12.

If this procedure is continued indefinitely, one evidently obtains sets [...], and thus a set (containing at least one point P) of Q_12 R_12 S_12 T_12 associated with a doubly infinite symbol

[...]

with the following properties: the point P never leaves the annular region of the figure; after m_1 iterations of T^{-1} it lies in Q_21 R_21 S_21 T_21, after m_2 further iterations in Q_12 R_12 S_12 T_12, after l_1 further iterations in Q_21 R_21 S_21 T_21, ...; after m'_1 iterations of T it lies in Q_21 R_21 S_21 T_21, and so on. Here the integers m_1, m'_1, l_1, l'_1, ... are arbitrary, except for the condition of being greater than k_i (i = 1, 2).

In a similar manner one sees that to every cycle

P_1 α Q_12 ω P_2 α Q_23 ω ... P_k α Q_k1 ω P_1,

where P_1, ..., P_k are hyperbolic fixed points and Q_12, Q_23, ... are heteroclinic points such that Q_{i,i+1} lies on the α branch issuing from P_i and on the ω branch issuing from P_{i+1}, there corresponds an extended neighborhood of such a cycle. Every point which never leaves this extended neighborhood corresponds to a symbol analogous to that already obtained in the case of a single hyperbolic fixed point.

In this way one can extend completely the symbolism and the results obtained above for a single hyperbolic fixed point to cycles of k > 1 hyperbolic fixed points joined successively by α and ω branches asymptotic to heteroclinic points in common. It may even happen here that among the points P_1, P_2, ..., P_k some (or all) correspond to the same h-periodic solution.

Although the study of the solutions neighboring given e- or h-periodic solutions has led us to an extraordinary and almost inconceivable variety of such solutions, it is evident that this method of attack cannot account for all the solutions, since there is no reason to believe that they remain in general in such a restricted neighborhood. To master the totality of the solutions it is therefore necessary to introduce other methods. This is what we shall do in our last chapter, while making fundamental use of the results we have just obtained.




CHAPTER V.

General theory of the solutions.

1. First survey.

The possible types of dynamical systems (n = 3) are extremely numerous. In the first place there are the systems which are transitive. These systems are characterized by the following property: choose any two points P_0 and Q_0 of S_3. One can then find two points P_1 and Q_1, in the neighborhoods of P_0 and Q_0 respectively, such that an arc of a trajectory of motion joins P_1 to Q_1. The transitive case is almost certainly the "general" case; in this case there exist no invariant regions in S_3.

At the other extreme one finds the integrable cases. Here there is a uniform integral other than that of energy; all the trajectories are then situated on an analytic family of surfaces, and the system is intransitive.

Between the very special case of integrability and the case of transitivity we have many intermediate intransitive possibilities.

Hence for a given system one must first of all determine whether it is transitive or not. But how are the invariant regions of S_3 to be determined, if any exist? The direct determination of these regions can be made only by the use of infinite processes which cannot at present be carried to completion; and the proof that none exist appears to be enormously difficult, except perhaps in the case where there is not a single e-periodic trajectory and the space S_3 is closed; the problem of geodesics on a closed surface of negative total curvature belongs to this special category. Indeed, despite the greatest efforts of geometers, it has not even been possible to prove that such invariant regions do not exist in the immediate neighborhood of every e-periodic trajectory. Here is the celebrated problem of stability in its most primitive form. I succeeded in proving, in my Memoir already cited, that such regions (if any exist) must have a form regular enough that one may hope to go further by the use of the regular processes of analysis. More recently (*) I have even succeeded in proving the existence of annular regions of instability, which indicates that the hypothesis of transitivity is almost certainly fulfilled in the general case. Moreover this hypothesis has always been employed by physicists with great profit in the theory of mechanics.

(*) Sur l'existence des régions annulaires de l'instabilité, Annales de l'Institut Henri Poincaré, vol. 3 (1931).

That is why we are going to occupy ourselves with the general theory of such a transitive system. We shall also indicate to what extent analogous results subsist when the hypothesis of transitivity is not satisfied.

There is a second hypothesis which we shall employ in this chapter, namely that there exists a regular surface of section S_2. Indeed we have already seen that dynamical systems with two degrees of freedom are divided into two categories: those for which such a surface S_2 exists, and the others, more complicated, for which none exists. But the cases of greatest interest from the point of view of applications possess regular surfaces of section.

In our theory the periodic solutions will play the central role. But there are at least two very simple cases in which such solutions do not exist in infinite number although a regular surface of section exists, namely the case of a regular surface of section S_2 of genus 0 together with a transformation T which is an ordinary rotation through an angle incommensurable with 2π, and the case of such an S_2 of genus 1 together with a transformation T of the form φ' = φ + α, ψ' = ψ + β, where φ and ψ are coordinates on S_2 and the constants α, β and 2π are incommensurable. In the first case there exist only two degenerate elliptic periodic solutions, with rotation coefficients equal in absolute value but of opposite signs; in the second case there exists not a single periodic solution. In both cases the behavior of the successive iterated points must be regarded as completely known, and consequently the two cases are to be regarded as "integrable".
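The behavior of the iterates under a rotation can be illustrated numerically in the simpler setting of a single angle. The sketch below is an assumption-laden illustration, not part of the memoir: the angle is measured in units of 2π, the function name `bins_visited` is hypothetical, and floating-point arithmetic only approximates an incommensurable angle.

```python
import math


def bins_visited(alpha, steps, bins=10):
    """Bins of [0, 1) visited by the orbit x_k = k * alpha (mod 1), k = 0..steps-1."""
    return {int(((k * alpha) % 1.0) * bins) for k in range(steps)}


golden = (math.sqrt(5.0) - 1.0) / 2.0    # incommensurable with 1
print(len(bins_visited(golden, 200)))    # 10: every bin is reached
print(sorted(bins_visited(0.25, 200)))   # [0, 2, 5, 7]: a periodic orbit of period 4
```

The orbit with incommensurable angle enters every tenth of the circle and never closes up, whereas the commensurable angle 1/4 gives a periodic orbit.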

It seems to me quite probable that these two cases are in fact the only ones in which an infinite number of periodic solutions does not occur, but I have not yet been able to prove it.

In order to avoid the consideration of such systems, which are without any doubt extraordinarily special, we shall also suppose that there exists no periodic solution of degenerate elliptic type and that there exists at least one periodic solution. However, we shall indicate how these hypotheses can be replaced by others less restrictive.




Thus we are going to make the following three hypotheses:

(1) the system considered is transitive in the space S_3 (*);

(2) it admits a regular surface of section S_2;

(3) it admits at least one periodic solution, but there are no degenerate e-periodic solutions.

We retain, of course, the hypothesis that the system considered is analytic and nonsingular.


2. Density of the solutions asymptotic to a periodic solution.

The analysis we have made leads us immediately to the following result on asymptotic sets:

Under the hypotheses indicated (U. t. h. i.) every asymptotic set E_α or E_ω attached to a fixed point, which is not a double asymptotic branch attached to two hyperbolic fixed points, is everywhere dense in S_2 (*).

This merely means that each corresponding set of asymptotic trajectories is everywhere dense in S_3.

Let us begin by considering the case of a (non-degenerate) elliptic fixed point, and suppose in the first place that the corresponding closed trajectory does not form one of the boundaries of the surface of section S_2. This trajectory will cut S_2 in n different points P_1, P_2, ..., P_n (n ≥ 1) such that

P_2 = T(P_1), P_3 = T(P_2), ..., P_1 = T(P_n),

whence T^n(P_i) = P_i, that is, the n points P_i are fixed points of the conservative transformation T^n.

Let us treat the case n = 1, where one has T(P) = P; the methods and the results for n > 1 are completely analogous.

As we have seen, there then exist two interlacing sets Σ_α and Σ_ω, which are closed and connected to P, and which separate the fixed point P from the points outside an arbitrarily small neighborhood of P. Before going further we must explain what the sets E_α and E_ω relative to such an elliptic fixed point are; in the case of a hyperbolic point, E_α and E_ω are naturally the indefinite prolongations of the asymptotic branches α and ω (analytic or hypercontinuous).

(*) We employ hypothesis (1) in the slightly stronger sense that none of the transformations T, T^{-1}, T^2, T^{-2}, ... leaves invariant a closed boundary curve on S_2, even if this curve does not enclose an invariant region of S_2.

(*) In S. T., I proved only the part of this result concerning elliptic fixed points of general type. Thanks to the results which we have obtained above, we are now certain that this result generalizes to every non-degenerate elliptic fixed point.

Choose an arbitrary neighborhood of an elliptic fixed point P. In this neighborhood there exist two interlacing sets Σ_α and Σ_ω of α- and ω-asymptotic points: these sets Σ_α and Σ_ω are closed and connected to P and extend to the boundary of the chosen neighborhood; the successive images of Σ_α and Σ_ω by T^{-1} and T respectively remain in this neighborhood and tend uniformly toward P. Moreover, one can find a closed boundary around P, of arbitrarily small diameter, which is composed exclusively of points of Σ_α and Σ_ω.

On the other hand, the image Σ'_α of Σ_α by T and the image Σ'_ω of Σ_ω by T^{-1} must pass outside the chosen neighborhood. Otherwise we should conclude, for example, that Σ'_α is included in Σ_α, that is, that

T(Σ_α) ⊆ Σ_α.

But this would require that the set Σ_α remain in the interior of the chosen neighborhood under the repetition of T as well as of T^{-1}, that is, that Σ_α be a closed set, connected to P, of αω points, which is not possible in the transitive case.

Let us now define the sets E_α and E_ω in the following manner:

E_α = Σ_α + Σ'_α + Σ''_α + ...,

E_ω = Σ_ω + Σ'_ω + Σ''_ω + ....

These sets E_α and E_ω are in general open, since they are the limits of increasing closed sets.

The definition of E_α (and also of E_ω) does not depend on the neighborhood chosen, provided only that it is sufficiently small. Indeed, it is evident that on choosing a smaller neighborhood the corresponding set Σ*_α will be included in Σ_α, and hence E*_α in E_α. Moreover, after a sufficiently large number of iterations of T^{-1}, Σ_α will be transformed into a part of the set Σ*_α. Hence one has

E_α ⊆ T^k(E_α) ⊆ E*_α.

We conclude that the two sets E_α and E_ω do not depend on the neighborhood chosen.




Let us note the evident fact that the two sets E_α and E_ω are transformed into themselves by T and T^{-1}.

There may well exist α and ω points which tend asymptotically toward a periodic point P without belonging either to E_α or to E_ω.

Let us now observe that the set E_α + E_ω must be everywhere dense.

Indeed, in the contrary case one would find an open region S of S_2 which contains neither points of E_α or E_ω nor any of their limit points. The boundary of S would be composed of points of E_α and E_ω and of their limit points.

Such a region S cannot coincide with S_2 but must actually be a part of S_2, since E_α and E_ω separate the neighborhood of the periodic point P from the rest of S_2. But E_α and E_ω are invariant under T; hence the image T(S) of such a region either coincides with S or with another entirely different region of the same kind. Since the total area of S_2 is finite, the successive images S, T(S), ... cannot all be different from one another, that is, we shall necessarily have T^k(S) = S for a suitable value of k ≠ 0. In this case the transformation T would not be transitive, which would contradict our hypothesis (1) of transitivity.
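The counting step in this argument (finitely many pairwise distinct regions of equal area, hence a return T^k(S) = S) can be mimicked on a toy model. The sketch below is purely illustrative: the dictionary `image_of` is a hypothetical permutation of five labelled regions of equal area, assumed to be a bijection, and is not derived from any system in the text.

```python
def first_return(image_of, start):
    """Smallest k >= 1 such that applying `image_of` k times to `start` gives `start` again.

    `image_of` must be a permutation (a bijection of its keys onto themselves).
    """
    current, k = image_of[start], 1
    while current != start:
        current = image_of[current]
        k += 1
    return k


# Hypothetical toy model: five regions of equal area permuted by T.
image_of = {"A": "B", "B": "C", "C": "A", "D": "E", "E": "D"}
print(first_return(image_of, "A"))  # 3
print(first_return(image_of, "D"))  # 2
```

Because the map is a bijection on a finite set of regions, the loop always terminates, which is the finite-area argument in miniature.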

Let us prove further that the two sets E_α and E_ω must separately be everywhere dense. Supposing the contrary, for example that E_α is not, there would have to exist an open region S of S_2 which contains neither the fixed point P nor any point of E_α. Choose a neighborhood D of the point P and, in this neighborhood, a neighborhood D* of P whose boundary consists entirely of points of Σ_α and Σ_ω taken relative to D. Consider the successive images D*_1, D*_2, ... of D* by T, which must all contain the point P. Because of transitivity, at least one of these regions must have a part in common with S; otherwise the part of S_2 occupied by these images would form an invariant region. But the points of the boundary of D* which belong to Σ_ω never extend outside of D. Hence points of E_α must lie in S, which is what we wished to prove.

So far we have supposed that the e-periodic solution does not correspond to a boundary of S_2. But the case where it corresponds to such a boundary offers no difficulty. Indeed, let us construct a local surface of section which cuts the corresponding trajectory at a non-zero angle. There exist regions around the corresponding fixed point whose boundaries are formed of the sets Σ_α and Σ_ω. The corresponding points of S_2 constitute the boundary of an analogous region in S_2 which extends all along the corresponding boundary. Using this fact, one can again reason in the same manner.



Henceforth we shall make no further special reference to the boundaries of S_2, since these boundaries function in all respects like ordinary fixed points of S_2.

It now remains to consider the case of a hyperbolic fixed point. We shall suppose in the first place that there exist no double branches. This condition is fulfilled in general.

Let E_α be an α branch issuing from such a point P. We wish to prove that E_α must be everywhere dense. If an adjacent branch E_ω issuing from the same fixed point P cuts E_α in a homoclinic point Q, we can reason essentially as before. Indeed the two arcs PαQ and QωP constitute a closed curve in S_2. Under indefinite iteration of T, the arc QωP tends toward P (see the accompanying figure).

Hence if E_α is not everywhere dense, the surface S_2 will be divided into open regions by its images under T, and these regions would have to be invariant under T. But that is not possible, since the given system is transitive.

We may therefore suppose that E_α cuts neither of its two adjacent branches E_ω.

Let us now consider the set L of limit points of E_α. They constitute a closed connected set which is invariant under T. Because of transitivity this set cannot contain interior points and must be in one piece. According to the well-known theorem of Brouwer, L therefore admits at least one fixed point. I say that in this case every set such as L must contain at least one elliptic fixed point.

Indeed, in the contrary case there is an L which contains only hyperbolic fixed points P*. This set L must contain at least one of the branches E*_α or E*_ω issuing from such a point P*. To see this (see the accompanying figure), note that there exists a closed connected part of L which extends from the fixed point P* to the circumference of a small circle with center at the point P*. This part cannot be situated between two adjacent branches, since under repeated iteration of T and of T^{-1} such a set would have E*_α and E*_ω respectively as limit points, and these limit points must belong to L. But this part and the branches E*_α or E*_ω cannot enclose an area. Otherwise by indefinite iteration of T^{-1} (or of T) one would conclude that L is everywhere dense.

We see consequently that L contains at least one branch E*_α or E*_ω issuing from such a hyperbolic fixed point P*. But all the limit points of such a set E*_α belong to L. Hence one obtains either L or a subset L* of L with the same properties.

Starting in this way from an arbitrary branch E_α or E_ω, one can replace L by other analogous subsets L*, L**, ... until a set L̄ is obtained with the following properties: (1) L̄ is closed, connected, and in one piece, without interior points; (2) T(L̄) = L̄; (3) L̄ admits only hyperbolic fixed points; (4) L̄ contains at least one of the branches E_α or E_ω issuing from any such fixed point; (5) L̄ is the set of limit points of each of these branches E_α or E_ω. We are certain that such an L̄ exists, since there is only a finite number of fixed points of T according to our hypotheses.

Let us prove that such an L̄ cannot exist. Start from one of its fixed points P and from one of the corresponding branches E_α issuing from P which belongs to L̄. By (5) it appears that the curve E_α must again approach the same fixed point P, but without cutting the two adjacent branches E_ω. Otherwise, starting from the configuration PαQωP of figure 24 and repeating the transformation T indefinitely, one would prove immediately that E_α must be everywhere dense.

Let us now construct a region QRUV filled by invariant curves (see the accompanying figure).

It appears that the branch E_α can approach the neighborhood of P only by penetrating into this region along VUV_1. By displacing the point U between V and V_1, and by taking the invariant curve UU_1 a little closer to VPQ, it appears that one can take the first point of intersection of E_α with VUV_1 at the point U itself.

Let us now carry out the transformation T. The curve PE_αU will be transformed into PE_αUU_1, where the arc UU_1 cannot cut V_1U_1RQ.

But if the point V is chosen in the neighborhood of the fixed point P, the simple closed curve PE_αUV of this figure lies near L̄, which is closed and simply connected. Consequently the closed curve UVPQRU and its interior region lie in the immediate neighborhood of L̄. Hence we may suppose that this curve encloses no fixed point other than those of L̄.

Nevertheless an examination of the analysis situs of the figure above shows us that, when the point K describes the closed curve UVPQRU, the rotation of the vector KK_1 joining K to its image K_1 under T will be precisely +2π, if this curve is described in the positive sense. Hence there is at least one fixed point in the interior of L̄ whose index is +1, and the curve UVPQRU must surround L̄.
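The rotation of the vector KK_1 along a closed curve, divided by 2π, can be computed numerically for simple model maps. The following is a minimal sketch under stated assumptions: the maps `elliptic` and `hyperbolic` are hypothetical linear area-preserving models with a single fixed point at the origin, and the curve is a circle rather than the curve UVPQRU of the figure.

```python
import math


def winding_number(T, center=(0.0, 0.0), radius=1.0, samples=720):
    """Number of turns made by the vector from K to T(K) as K goes once,
    counterclockwise, around a circle enclosing the fixed point."""
    total = 0.0
    prev = None
    for i in range(samples + 1):
        t = 2.0 * math.pi * i / samples
        x = center[0] + radius * math.cos(t)
        y = center[1] + radius * math.sin(t)
        tx, ty = T(x, y)
        angle = math.atan2(ty - y, tx - x)
        if prev is not None:
            d = angle - prev
            # bring the increment into (-pi, pi]
            while d <= -math.pi:
                d += 2.0 * math.pi
            while d > math.pi:
                d -= 2.0 * math.pi
            total += d
        prev = angle
    return round(total / (2.0 * math.pi))


def elliptic(x, y, theta=0.7):
    """Model elliptic fixed point: rotation through the angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return c * x - s * y, s * x + c * y


def hyperbolic(x, y):
    """Model hyperbolic fixed point: stretch by 2 along x, shrink by 1/2 along y."""
    return 2.0 * x, 0.5 * y


print(winding_number(elliptic))    # 1
print(winding_number(hyperbolic))  # -1
```

In these models a rotation of +2π of the vector KK_1 corresponds to index +1, as for the elliptic map, while the hyperbolic map gives index -1; a curve enclosing only fixed points of the second kind therefore cannot produce a total rotation of +2π.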

But this possibility must also be excluded. Indeed, it also appears from the same figure that the area interior to UVPQRU must be simply connected and would be transformed into a part of itself by T, which is impossible.

Consequently every branch E_α (or E_ω) issuing from a hyperbolic fixed point which is not everywhere dense must penetrate into the immediate neighborhood of at least one elliptic fixed point.

If an adjacent branch E_ω is everywhere dense, it must penetrate into the immediate neighborhood of the same fixed point; hence E_α and E_ω must cut each other, since E_α and E_ω cannot cut the asymptotic sets Σ_α and Σ_ω respectively issuing from the elliptic point. In this case we should have found a homoclinic point, which is impossible, since in this case E_α would have to be everywhere dense.

But the set E_α and the adjacent set E_ω must cut each other in homoclinic points in the contrary case also. Indeed this is evident if E_α and E_ω penetrate into the immediate neighborhood of the same elliptic fixed point P. If they penetrate into the neighborhoods of two different elliptic fixed points P_1 and P_2, we reason in the following manner. The set Σ_α attached to P_1 is everywhere dense and penetrates into the neighborhood of P_2 without having any point in common with the set Σ_α of P_2. The branch E_ω also penetrates into the immediate neighborhood of P_2 without having any point in common with the set Σ_ω of P_2. Hence E_ω and the set Σ_α of P_1 cut each other. It follows that E_ω also penetrates into the neighborhood of P_1, hence that E_α and E_ω cut each other in homoclinic points.

The result stated is now proved in the case where there exist no double invariant branches attached to two hyperbolic points.

To treat the cases where double branches exist we have only to modify the reasoning as follows. We treat all the hyperbolic fixed points which are connected with one another by double branches as constituting a single "fixed point" of T (see figure 27). Such a set ("point") is closed and simply connected and possesses an even number of branches which are not double and which are alternately of type α and of type ω. Considering this set of fixed points and double branches as a hyperbolic fixed point, and employing the properties of cycles of branches (section 4, chapter IV), one sees that essentially the same reasoning still applies here.

Let us now note the following fact:

There always exist an infinite number of hyperbolic fixed points of T^k (k = 1, 2, 3, ...). If there exists a single elliptic fixed point, there exist an infinite number of them.

Indeed, if there exists a single elliptic fixed point, we already know that there must exist an infinite number of fixed points of T^k in its immediate neighborhood, of elliptic type as well as of hyperbolic type. But in the contrary case there exists at least one hyperbolic fixed point according to our hypothesis (3), whose extended neighborhood will contain an infinite number of other fixed points of T^k (k = 1, 2, ...).




3. Mutual relations of the asymptotic solutions.


The following facts are also important:

Under the hypotheses indicated, let S be a closed simply connected set of points of S_2 which contains at least one point P belonging to an asymptotic set of a fixed point. Suppose that the immediate neighborhood of P in S is not a part of a single asymptotic set. Then this neighborhood of P in S will contain an infinite number of asymptotic points. Moreover, S will contain in the same neighborhood an infinite number of asymptotic points belonging to the set E_α or E_ω relative to any fixed point whatever, and the successive images of this neighborhood of P in S by T or by T^{-1} will be everywhere dense in S_2.

If P is a fixed point (elliptic or hyperbolic), it is evident that S will be cut an infinite number of times by the set E_α + E_ω of this point, in the immediate neighborhood of P. If P is another point of E_ω, for example, the transformations T, T^2, ... will make P tend asymptotically toward the corresponding fixed point. If at the same time the set S tends uniformly toward the same point, S must be a part of E_ω. In this case the stated conclusion holds. In every other case it is evident that the images T^k(S) must cut one of the branches E_α or E_ω relative to this point an infinite number of times; hence it is evident that S contains an infinite number of points of E_α or E_ω, and the same conclusion still subsists.

If S contains two points of such an E_α relative to a fixed point, we can complete the proof in the following manner. The set E_α + S must enclose an area (or divide S_2 by a boundary); hence every other set E*_α must cut E_α + S along S an infinite number of times. Hence the only case which remains is that in which S is cut only by points of the sets E_ω, and consequently contains two points of an E_ω, which is impossible by the same reasoning.

Under these circumstances it is evident that S and its successive images by T or by T^{-1} will be everywhere dense.

Considering the sets S which belong to an asymptotic set, we arrive at the following result:

Under the hypotheses indicated, let S be a closed (simply) connected part of an asymptotic set E_α (or E_ω) which contains two points P and Q belonging to E'_ω and E''_ω (or E'_α and E''_α) respectively. Then S will contain, in the immediate neighborhood of P and of Q, an infinite number of points belonging to any E'''_ω (E'''_α) whatever, and its successive images by T are everywhere dense in S_2.

Indeed, consider the successive images of S by T, T^2, .... The points P and Q will then tend toward the corresponding fixed points P' and Q'. Hence its images will be cut by arbitrary sets E'''_ω, since these all penetrate into the neighborhoods of P' and Q' without cutting the sets E'_ω or E''_ω.


4. Isomorphic solutions.

Let us define the set Σ_Q, for every point Q of S_2, as the set of points accessible from Q (*) without meeting two particular adjacent non-double branches E_α and E_ω issuing from a hyperbolic fixed point P (*).

To make the structure of these sets Σ_Q visible, let us imagine that E_α and E_ω are prolonged progressively. They will divide S_2 into regions of smaller and smaller areas. For a point Q which remains in the interior of these areas, the set Σ_Q is the limit set of the corresponding area, and hence a closed and simply connected set which may reduce to the single point Q. For every other point Q situated on E_α (or E_ω) without being situated at a homoclinic point, the set Σ_Q must reduce to a segment of E_α (or E_ω) or to Q. Indeed, if Σ_Q extends beyond E_α, this set will have a subset RS which joins a point R of E_α to a point S outside E_α (see the accompanying figure). But according to the results already obtained, RS would contain in this case points of E_α or E_ω in the neighborhood of R, which is not possible according to our definition of Σ_Q.

For such a set Σ_Q the two points which terminate it cannot belong to E_ω (or E_α), since otherwise there would exist points of E_ω (or E_α) in the interior of Σ_Q.


(*) Along a closed set connected to Q.

(*) As we have said, such a point P always exists.



In the same manner one concludes that for a point of intersection of E_α and E_ω, whether at a homoclinic point or at the fixed point, one will necessarily have Σ_Q = Q.

If Q belongs to a double branch, one sees that the whole set of double branches connected to this branch must be regarded as the corresponding set Σ_Q.

The definition also makes evident the fact that the transformations T and T^{-1} transform the sets Σ_Q into one another:

T(Σ_Q) = Σ_{T(Q)};  T^{-1}(Σ_Q) = Σ_{T^{-1}(Q)}.

Two sets Σ_Q and Σ_R are either entirely distinct or coincide.

Another fundamental property is the fact that Σ_Q does not depend either on the adjacent sets E_α, E_ω or on the fixed point P that one chooses. To see this, consider in the first place a point Q which belongs to no set E_α or E_ω attached to a hyperbolic fixed point. I say that if Σ'_Q is defined relative to a fixed point P' and Σ''_Q is defined relative to P'', every point of Σ'_Q must belong to Σ''_Q and conversely. Otherwise there exists a point R of Σ'_Q which is a point of E''_α or E''_ω. But according to a result obtained above there then exist an infinite number of points of E'_α and E'_ω in the immediate neighborhood of R, which contradicts the very definition of Σ'_Q.

In the second place suppose that Q is situated on E'_α. In this case Σ'_Q reduces either to Q or to a segment of E'_α which contains no point belonging to any E''_α or E''_ω whatever. Otherwise there would exist points of E'_α or E'_ω in the neighborhood. One concludes therefore that Σ'_Q must be a subset of Σ''_Q in this case, if Σ'_Q and Σ''_Q are not identical. But one sees immediately that in this case Σ''_Q will be cut an infinite number of times by E'_α or E'_ω, hence also by E''_α or E''_ω, which contradicts the definition of Σ''_Q.

In the third place we observe that, if Q is a point belonging to a double branch, Σ_Q is evidently independent of the fixed point chosen.

The same reasoning shows us that Σ_Q cannot contain an asymptotic point attached to an elliptic point without reducing to a simply connected subset of the corresponding set E_α or E_ω.

We can now summarize our results in the following manner:

Under the hypotheses indicated, to each point Q there corresponds a single closed, simply connected set Σ_Q which contains the point Q. Two sets Σ_Q either coincide or have no point in common. The sets Σ_Q are transformed into one another by T and T^{-1}. They are independent of the hyperbolic fixed point and of its adjacent (non-double) branches E_α and E_ω that one chooses. In fact, if Q belongs to no asymptotic set, Σ_Q is the maximum set accessible from Q without meeting any asymptotic set E_α or E_ω whatever; if Q belongs to a single asymptotic set E_α or E_ω, Σ_Q reduces to a closed segment of E_α or E_ω respectively which contains no point of another asymptotic set; if Q belongs to two asymptotic sets E_α and E_ω, one has Σ_Q = Q, except in the case where these sets reduce to a double branch attached to two hyperbolic fixed points; but if the point Q belongs to two asymptotic sets E_α and E_ω attached to two elliptic fixed points, Σ_Q reduces to a heteroclinic set common to this E_α and E_ω which contains no point of another E_α or E_ω; finally, if Q belongs to a connected set of fixed points and double branches (which may reduce to a periodic point Q), Σ_Q is precisely this set.

There is another fundamental property which we express in the following way.

Under the hypotheses indicated, if the point Q tends toward Q_0, the set Σ_Q tends at the same time toward Σ_{Q_0}.

This property shows us that the sets Σ_Q are upper semi-continuous sets according to the definitions of R. L. Moore (*) and hence enjoy the known properties of such sets.

To prove this property, suppose in the first place that Q_0 belongs to no asymptotic set. The corresponding sets E_α and E_ω of the definition surround Σ_{Q_0} (see the accompanying figure) in its immediate neighborhood, since otherwise Σ_{Q_0} would be a still larger set.


(*) See his book, Foundations of Point Set Theory, New York (1930).





But the set $\Sigma_Q$ cannot cross the boundary thus obtained if $Q$ is in the neighborhood of $\Sigma_{Q_0}$, because of the very definition of $\Sigma_Q$. We see therefore that $\Sigma_Q$ must approach $\Sigma_{Q_0}$ in this case.

If $Q_0$ belongs to an asymptotic set, $\Sigma_{Q_0}$ can again be surrounded in the same manner, whence one obtains the same conclusion.

For quite evident reasons we call the solutions of such a set $\Sigma_Q$ 'isomorphic'.


5. The regular recurrent solutions.

Since every homoclinic point is a limit point of periodic points, we obtain the following result by considering the preceding figure.

D. l. h. i. the totality of the periodic points is everywhere dense relative to the sets $\Sigma_Q$. In particular, if one always has $\Sigma_Q = Q$, the periodic points are everywhere dense in the ordinary sense.

The interest of this result lies in the fact that the case $\Sigma_Q = Q$ is probably the general case.

Let us pass now to the general consideration of the points which are recurrent but not periodic. Let us call a point 'regular recurrent' if for each point $Q$ of the set of its limit points one has $\Sigma_Q = Q$. We have seen that $\Sigma_Q = Q$ for every periodic point (1).

D. l. h. i. every regular recurrent set $R$ possesses $\alpha$ and $\omega$ asymptotic sets which are closed and connected to the recurrent set itself.



To prove this, we consider a small region $\sigma$ about an arbitrary point $Q$ of the recurrent set $R$. Under iteration of $T$ the images of the region $\sigma$ cannot remain in the neighborhood of $R$, although they always contain points of $R$; otherwise transitivity would not hold. Hence after $k$ iterations one of the images of $\sigma$ must leave a given neighborhood of $R$ (see the figure). One can therefore draw from $Q^{(k)}$ to $P^{(k)}$ a continuous curve $Q^{(k)}P^{(k)}$ such that $P^{(k)}$ lies on the boundary of the given neighborhood, while the whole arc $Q^{(k)}P^{(k)}$, as well as its images by $T^{-1}$ up to the $k$-th, do not pass outside this neighborhood.

(1) We could even show without difficulty that there are quasi-periodic points which are regular.

Taking $\sigma$ smaller and smaller, one obtains arcs $Q^{(k)}P^{(k)}$ for which $k$ increases without limit, and $Q^{(k)}$ always belongs to $R$, while $P^{(k)}$ lies on the boundary of the given neighborhood. Consequently there must exist a certain connected limit set $QP$, extending from a point $Q$ of $R$ to a point $P$ of this boundary, such that $QP$ and all its images under indefinite iteration of $T^{-1}$ remain in this neighborhood.

I say that this set $QP$ is an $\alpha$ asymptotic set attached to the recurrent set $R$.

Indeed, in the contrary case, we could find an increasing sequence $k_1, k_2, \ldots$ of values of $k$, and corresponding points $T^{-k_1}(Q), T^{-k_2}(Q), \ldots$, such that the corresponding points $T^{-k_i}(P)$ lie outside a small definite neighborhood of $R$, while all the points of $T^{-k_i}(QP)$ lie in the given neighborhood. We see that the arc $T^{-k_i}(QP)$ must remain in this neighborhood during $k_i$ iterations of $T$ as well as under indefinite iteration of $T^{-1}$. In this manner one would obtain a second set $Q^*P^*$, where $P^*$ lies outside $R$ and $Q^*$ belongs to $R$, which remains entirely in the given neighborhood when $T$ or $T^{-1}$ is repeated indefinitely.

But the condition $\Sigma_Q = Q$, which applies to the point $Q^*$ of $Q^*P^*$, guarantees that $Q^*P^*$ contains points of the sets $E_\alpha$ or $E_\omega$ attached to any periodic point. Evidently this would lead us to a contradiction on taking a sufficiently small neighborhood of $R$.

Consequently the images of $QP$ by $T^{-k}$ must approach $R$ uniformly, as we wished to prove.

Hence there exist connected $\alpha$ and $\omega$ asymptotic sets attached to any given regular recurrent set, which extend at least to a certain distance from this set. But such an $\alpha$ asymptotic set cannot cut a branch $E_\alpha$ issuing from any periodic point whatever. It must therefore cut all the non-double sets $E_\omega$ issuing from any periodic point whatever, and so extends into every neighborhood of every periodic point without cutting the branches $E_\alpha$ issuing from the same point. In the same way an $\omega$ asymptotic branch attached to the regular recurrent set $R$ extends into every neighborhood of the same point without cutting the branches $E_\omega$ issuing from this point. Hence these $\alpha$ and $\omega$ asymptotic sets attached to a regular recurrent set must cut each other an infinite number of times.



I say furthermore that each point of $R$ is surrounded in its neighborhood by the asymptotic sets of $R$. Otherwise we could find a set $\Sigma$, closed and connected to $R$, which contains none of these asymptotic points. But $\Sigma$ will certainly be cut an infinite number of times by all the sets $E_\alpha$ and $E_\omega$, since $R$ is regular recurrent. Hence $\Sigma$ will be cut also by the asymptotic sets of $R$.

In this manner we see (d. l. h. i.) that these new asymptotic sets $E_\alpha^*$ and $E_\omega^*$ attached to a regular recurrent set $R$ (in particular to any e-periodic point) are separately everywhere dense in $S$, and cut infinitely often every set $E_\omega$ or $E_\alpha$, or even $E_\omega^*$ or $E_\alpha^*$, respectively. Moreover, $E_\alpha^*$ and $E_\omega^*$ surround each point of $R$ in its immediate neighborhood, and therefore serve as well as the sets $E_\alpha$ and $E_\omega$ attached to the hyperbolic fixed points to define the sets $\Sigma_Q$.

We shall not push the study of the structure of the regular recurrent sets any further here.


6. The irregular recurrent solutions.

Let us consider now a recurrent set $R$ for which one does not have $\Sigma_Q = Q$ for every point of $R$ (irregular case).

The set $R'$ formed by all the sets $\Sigma_Q$, where $Q$ belongs to $R$, will be called 'the extension of $R$'. Evidently the extension of $R$ is a closed set which is transformed into itself by $T$. If $R$ is regular one has $R' = R$.

The extension $R'$ may contain only the single recurrent set $R$. More generally $R'$ may contain other recurrent sets $R_1, R_2, \ldots$. Let us prove the following fact:

D. l. h. i. if the extension $R'$ of a recurrent set $R$ contains the recurrent sets $R_1, R_2, \ldots$, one will always have $R_1' = R_2' = \cdots = R'$.

To prove the typical relation $R_1' = R'$, we observe in the first place that, if $P$ belongs to $R$ and if a point $P_1$ of $\Sigma_P$ belongs to $R_1$, one will also have $\Sigma_P = \Sigma_{P_1}$, $\Sigma_{T(P)} = \Sigma_{T(P_1)}$, .... We see thus that every set $\Sigma_Q$ attached to $R$ must contain a point which belongs to $R_1$. Indeed, when $T^n(P)$ tends toward an arbitrary point $P^*$ of $R$, the set $\Sigma_{T^n(P)}$ tends either toward $\Sigma_{P^*}$ or toward a subset of $\Sigma_{P^*}$. Let us observe also that in the case which we are considering ($R$ not periodic) none of the sets $R_1, R_2, \ldots$ can be periodic; in the contrary case we should have $\Sigma_{Q_i} = Q_i$ for any point $Q_i$ of the corresponding set $R_i$, and hence $R$ would have to be periodic, which is not possible.

Let us suppose now that $k$ increases in such a way that $T^k(P)$ tends toward an arbitrary point $P^*$ of $R$. Evidently the indices $k$ can be chosen so that $T^k(P_1)$ approaches at the same time a limit point $P_1^*$ of $R_1$. But in this case $\Sigma_{T^k(P)}$ must tend either toward $\Sigma_{P^*}$ or toward a part of this set. Hence $\Sigma_{P^*}$ will certainly contain the point $P_1^*$ of $R_1$ as well as the point $P^*$ of $R$.

We see therefore that every set $\Sigma_Q$ attached to $R$ must contain a point $Q_1$ of $R_1$, with $\Sigma_Q = \Sigma_{Q_1}$. Conversely each set $\Sigma_{Q_1}$ attached to $R_1$ must contain a point $Q$ of $R$. It follows that $R' = R_1'$, which is what we wished to prove.

In view of this result all the recurrent sets $R, R_1, \ldots$ of $R'$ are irregular.

It should be remarked that the sets $R_0 = R, R_1, \ldots$ are always distinct, in such a way that the distance between any two points of $R_i$ and $R_j$ ($i \neq j$) exceeds a quantity $\delta_{ij} > 0$. Hence if the set $R'$ contains more than a single ordinary recurrent set, all the $\Sigma_Q$ must have a diameter greater than some $d > 0$.

The recurrent sets $R, R_1, \ldots$ associated in this manner will be called 'isomorphic' (1). In order to justify this terminology, let us observe that an arbitrary set $\Sigma_Q$ of this totality always has representatives of all the recurrent sets $R, R_1, \ldots$. One thus establishes an isomorphism between the closed subsets of $R$, of $R_1$, of $R_2$, etc.

In many respects such a set of recurrent motions plays a collective role. Let us therefore call such a set '$\Sigma$-recurrent'.

Let us suppose now that we treat the sets $Q^* = \Sigma_Q$ as the elements, in place of the points $Q$. The 'immediate neighborhood' of one of these elements is an ordinary immediate neighborhood of $\Sigma_{Q_0}$, since every $\Sigma_Q$ of which one point is situated in this ordinary neighborhood of $\Sigma_{Q_0}$ is situated entirely in such a neighborhood. If we then form the sets $\Sigma_{Q^*}$ relative to such elements $Q^*$, we shall find immediately the fundamental relation

$$\Sigma_{Q^*} = Q^*,$$

since every $Q^* = \Sigma_Q$ is surrounded in its immediate neighborhood by the sets $E_\alpha$ and $E_\omega$.

In this manner we obtain the following result:

D. l. h. i. every recurrent set $R_0$ defines a set of elements $\Sigma_Q$ which is $\Sigma$-recurrent. Each of these elements contains at least one point of all the isomorphic recurrent sets $R_0, R_1, R_2, \ldots$ and contains no other recurrent points. There always exist two sets $E_\alpha^*$ and $E_\omega^*$ of elements $\Sigma_Q$, closed and connected to the $\Sigma$-recurrent set, such that $E_\alpha^*$ and $E_\omega^*$ tend asymptotically toward this $\Sigma$-recurrent set. Moreover $E_\alpha^*$ and $E_\omega^*$ must cut an infinite number of times every other set $E_\omega^*$ and $E_\alpha^*$ respectively, attached to an arbitrary periodic point or to this $\Sigma$-recurrent set, and they are separately everywhere dense in $S$.

(1) We do not wish to indicate by this notation that the $R_i$ are necessarily denumerable.

The structure of the constituent parts $\Sigma_Q$ which enter into a $\Sigma$-recurrent set must be of a quite particular character, as is indicated by the following fact:

In every $\Sigma$-recurrent set with isolated recurrent sets $R_i$, there always exist closed sets connected to $R_i$ which tend either $\alpha$ or $\omega$ quasi-asymptotically toward this set $R_i$.

We mean by this (1) that such a closed connected set remains always in a small neighborhood of $R_i$ when $T^{-1}$ (or $T$) is repeated indefinitely, and (2) that every point $P$ of this set is transformed at least once into an arbitrarily small neighborhood of $R_i$, uniformly often, under the repetition of $T^{-1}$ (or $T$).

The proof is almost immediate. Let us begin with a small region $\sigma$ about a point $Q$ of such an isolated recurrent set $R_i$, for which $\Sigma_Q \neq Q$. There exists then a closed connected part of $\Sigma_Q$ which extends from $Q$ to the edge of this region. If this part remains in the neighborhood of $R_i$ during the indefinite iteration of $T$, it constitutes an $\omega$ quasi-asymptotic set attached to $R_i$, because an arbitrary point of this neighborhood must approach a particular recurrent set uniformly often (1), and this recurrent set can be none other than $R_i$ itself. But if this part, however small it may be, does not remain in the immediate neighborhood of $R_i$, its successive images by $T$ must approach a closed set connected to $R_i$ which remains in this neighborhood when $T^{-1}$ is indefinitely repeated; this set must then constitute an $\alpha$ quasi-asymptotic set attached to $R_i$.

An analogous case, the simplest possible, is that of two fixed points connected by a double hypercontinuous branch. In this case the two fixed points correspond to two isolated recurrent sets $R_1$ and $R_2$ which are isomorphic, and the double branch corresponds to a set which is $\alpha$ quasi-asymptotic to $R_1$ (for example) and $\omega$ quasi-asymptotic to $R_2$.

(1) According to a fundamental property of recurrent sets.




7. The general solution.

In what precedes we have proved the existence of an extraordinary diversity of types of special solutions, that is to say, solutions which correspond to special points of $S$, for which the sequences of successive images by $T$ and by $T^{-1}$ do not both cover $S$ densely. But in the transitive case which we are considering one can prove immediately that there exist general points of $S$ also (1). We wish now to consider these general points.

D. l. h. i. the set of the periodic points and of their asymptotic sets constitutes a set of special points of measure zero. If the hypothesis of ordinary transitivity is replaced by that of 'metric transitivity' (2), the set of all the special points is of measure zero.

To prove these evident facts we observe first that the periodic points form a denumerable set, so that it suffices to prove, for example, that the $\alpha$ asymptotic set of every periodic point must be of measure zero. But this is evident, since an arbitrary finite part of this set tends uniformly toward the fixed point without changing its measure.

In the second place we observe that the totality of the points which are special (3) under the repetition of $T$ is invariant under $T$, hence of measure zero or of the measure of $S$, according to the hypothesis of metric transitivity. If this measure were that of $S$ we could reason in the following manner. Every point $P$ which is special under $T$ fails to enter the neighborhood of some particular point of $S$. Let us now choose small regions $C_i$ ($i = 1, 2, \ldots, l$), with areas at least $\sigma > 0$, which cover $S$. We see then that some of the sets $S_i$, composed of the points which never enter $C_i$ under the repetition of $T$, must have a positive measure. But these sets are invariant under $T$ (4). Consequently one of the sets $S_i$ would have to have the measure of $S$, according to the hypothesis of metric transitivity. But this is impossible, since this measure is at most that of $S$ less $\sigma$.

(1) See S. T.

(2) See an article by P. A. Smith and myself: "Structure Analysis of Surface Transformations", American Journal of Mathematics, vol. 7 (1928).

(3) This totality is measurable, since the set of points which do not enter a given open region under the repetition of $T$ (or $T^{-1}$) is always closed.

(4) Such a set is transformed into a part of itself by $T$, hence essentially into itself, since $T$ is conservative.

The use of the ergodic theorem (1) gives us the following information:

D. l. h. i., except for a set of measure zero, every point of $S$ is transformed by $T, T^2, \ldots$ and by $T^{-1}, T^{-2}, \ldots$ into a sequence of points whose density of distribution on every measurable set tends toward a limit. If metric transitivity holds, the limit of this density will be precisely proportional to $\iint Q(u, v)\,du\,dv$ taken over that set, where $\iint Q(u, v)\,du\,dv$ denotes the integral invariant of $T$.

Evidently this result does not fall within the purely topological domain.
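The statement above can be illustrated numerically on a toy example. The Python sketch below does not use the surface transformation $T$ of the text; as an assumption made only for illustration, it replaces $T$ by the rotation $x \mapsto x + \alpha \pmod 1$ of the circle through the golden-mean angle, which preserves length and is metrically transitive. The names `visit_frequency`, `alpha`, `forward` and `backward` are hypothetical. The sketch counts how often the forward and backward images of a point fall in an interval and compares that frequency with the length of the interval.

```python
import math

def visit_frequency(x0, alpha, a, b, n):
    """Fraction of the points x0 + k*alpha (mod 1), k = 0..n-1, lying in [a, b)."""
    hits = 0
    for k in range(n):
        # Compute each image directly to avoid accumulating rounding error.
        x = (x0 + k * alpha) % 1.0
        if a <= x < b:
            hits += 1
    return hits / n

# Hypothetical stand-in for T: rotation by the golden-mean angle.
alpha = (math.sqrt(5) - 1) / 2

# Images under T, T^2, ... and under T^-1, T^-2, ... of the point 0.1,
# counted on the interval [0.2, 0.5), whose length is 0.3.
forward = visit_frequency(0.1, alpha, 0.2, 0.5, 100_000)
backward = visit_frequency(0.1, -alpha, 0.2, 0.5, 100_000)
print(forward, backward)
```

Under these assumptions both frequencies should lie close to 0.3, the measure of the interval, which is the behavior the ergodic theorem predicts when metric transitivity holds.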

The effective consideration of all the points of $S$ presents essential difficulties. To tell the truth, there is only a single case in which a transitive dynamical system of two degrees of freedom has been discussed in an almost satisfactory manner, namely that of the geodesics on an ordinary closed surface of negative curvature, already mentioned. Such a discussion is possible in this case for the two following reasons: (1) there exist no e-periodic solutions, which permits a very rapid circulation of the points of $S$, since all the periodic solutions are formally (and hence actually) unstable; (2) there is only a single essential invariant which enters, namely the genus $p$ of the surface. It results that one can find a point $P$ of $S$ such that the sequence of points

$$\ldots,\ T^{-1}(P),\ P,\ T(P),\ T^{2}(P),\ \ldots$$

does not differ much from a series of arbitrarily given finite sequences

$$\ldots;\quad P_{-1},\ T(P_{-1}),\ \ldots,\ T^{a_{-1}}(P_{-1});\quad P_0,\ T(P_0),\ \ldots,\ T^{a_0}(P_0);\quad P_1,\ T(P_1),\ \ldots,\ T^{a_1}(P_1);\quad \ldots,$$

where $P_0, P_{-1}, P_1, P_{-2}, \ldots$ are arbitrary points and $a_0, a_{-1}, \ldots$ are sufficiently large integers. More exactly, the corresponding points $T^k(P)$ are near the corresponding points of these sequences, except at the ends.

(1) I succeeded in proving this astonishing theorem in an article, Proof of the Ergodic Theorem, Proceedings of the National Academy of Sciences, vol. 17 (1931). See also a historical note by B. Koopman and myself, Recent Contributions to the Ergodic Theory, Proceedings of the National Academy of Sciences, vol. 17 (1931). This note contains references to the earlier work of B. Koopman, J. v. Neumann and E. Hopf. It is only the ergodic theorem which gives the result stated above.



Here is a fundamental qualitative property which results from the rapid circulation of the points of $S$ among themselves, and which cannot hold if there exists a single elliptic periodic point.

To tell the truth, all the resources of modern Analysis have not sufficed up to the present to prove that the images of a point in the immediate neighborhood of an elliptic periodic point can leave it (problem of stability). The only thing certain is that the circulation of the points must be extremely slow in this case.

This circumstance shows us conclusively that one cannot hope to choose almost arbitrarily the different parts of the sequence of points

$$\ldots,\ T^{-1}(P),\ P,\ T(P),\ T^{2}(P),\ \ldots$$

Indeed, if one of these points is situated in the neighborhood of an elliptic fixed point, its successive images will remain there during an enormous number of iterations of $T$ and of $T^{-1}$.

Furthermore, in the case which we are considering there exists an infinite number of invariants instead of a single one. Consequently there is a corresponding variety of dynamical systems, and it is only after a thorough study of the invariants in each particular case that one can hope to determine the general behavior of the possible solutions. We introduce in the following section an algorithmic method which permits a systematic study of the invariants and of the behavior of all the solutions.

In spite of these essential difficulties of complexity we can formulate a result of some significance:

D. l. h. i., let

$$\ldots,\ C_{-1},\ C_1,\ C_2,\ \ldots$$

be an arbitrary doubly infinite sequence of periodic, regular recurrent or $\Sigma$-recurrent sets; and let

$$\ldots,\ [D_{-1,i}],\ [D_{1,i}],\ [D_{2,i}],\ \ldots$$

be a corresponding sequence in which $[D_{j,i}]$ consists of a finite number of such sets other than $C_j$. Let $\varphi_{j,i}(\varepsilon) > 0$ be continuous functions which correspond to the $D_{j,i}$ and which decrease toward zero with $\varepsilon$. Then one can find in the neighborhood of a given point of $S$ a point $P$, and values of $\varepsilon_1, \varepsilon_2, \varepsilon_{-1}, \varepsilon_{-2}, \ldots$ as small as one wishes, such that the successive images of the point $P$ by $T$ enter the $\varepsilon_1'$ neighborhood of $C_1$ ($\varepsilon_1' < \varepsilon_1$) before reaching the $\varphi_{1,i}(\varepsilon_1')$ neighborhood of any one of the $D_{1,i}$, then enter the $\varepsilon_2'$ neighborhood of $C_2$ ($\varepsilon_2' < \varepsilon_2$) before reaching the $\varphi_{2,i}(\varepsilon_2')$ neighborhood of any one of the $D_{2,i}$, and so on; and at the same time the successive images of $P$ by $T^{-1}$ enter the $\varepsilon_{-1}'$ neighborhood ($\varepsilon_{-1}' < \varepsilon_{-1}$) of $C_{-1}$ before reaching the $\varphi_{-1,i}(\varepsilon_{-1}')$ neighborhood of any one of the $D_{-1,i}$, and so on indefinitely.

We can prove this result easily by employing some facts established previously.

Since the $\alpha$ and $\omega$ asymptotic sets attached to $C_1$ are everywhere dense, there exists a point $Q_0$ belonging to the $\omega$ asymptotic set in the neighborhood considered of an arbitrary point of $S$. Consequently, by choosing a value of $\varepsilon_1' < \varepsilon_1$ sufficiently small and a point $Q_1$ sufficiently near $Q_0$, we shall be certain that the images first enter the $\varepsilon_1'$ ($< \varepsilon_1$) neighborhood of $C_1$ before reaching the $\varphi_{1,i}(\varepsilon_1')$ neighborhood of any one of the $D_{1,i}$. Indeed, when $Q_1$ tends toward $Q_0$ the sequence of images approaches more and more closely a sequence $\omega$ asymptotic to $C_1$. All the points sufficiently near this point $Q_1$ will have the same property. It follows that one can choose a neighboring point $Q_2$, near the $\alpha$ asymptotic set of $C_{-1}$, such that $Q_2$ has not only the property already obtained for $Q_1$, but also the property that the images of $Q_2$ by $T^{-1}$ enter the $\varepsilon_{-1}'$ ($< \varepsilon_{-1}$) neighborhood of $C_{-1}$ before reaching the $\varphi_{-1,i}(\varepsilon_{-1}')$ neighborhood of any one of the $D_{-1,i}$. Proceeding thus indefinitely, one obtains a point $P = \lim Q_n$ of the kind indicated.

We see therefore that one can prescribe somewhat arbitrarily the succession of the approaches of $P$ to the immediate neighborhoods of the periodic, regular recurrent and $\Sigma$-recurrent sets.

Since, according to the general theory of recurrent sets, the images of an arbitrary point must approach the recurrent sets, we see that this result concerns the totality of the points.


8. The signature.

To go further it seems therefore necessary to identify the topological invariants of the system considered.

D. l. h. i., the network on the surface of section formed by two adjacent branches $E_\alpha$ and $E_\omega$ of a hyperbolic fixed point $P$ characterizes the dynamical system from the topological point of view, at least up to a certain point. In particular, if the system is regular (such that $\Sigma_Q$ always reduces to $Q$), the network characterizes it completely.

Such a network on the surface of section is invariant under the associated transformation $T$; hence, to follow the images of a point, one has only to follow the images of the points and of the arcs of this network under the iteration of this transformation.




These facts are almost evident. Indeed, in this chapter we have studied the given dynamical system by means of such a network, which constitutes, so to speak, a kind of two-dimensional symbol adapted to the theoretical needs.

We are going to show how one can give this general idea a more precise form.

Let $P$ be the hyperbolic fixed point and $P\alpha Q\omega P$ an elementary closed loop made up of an arc $P\alpha Q$ of an $\alpha$ asymptotic branch and an arc $Q\omega P$ of an $\omega$ asymptotic branch, which meet only in a homoclinic point $Q$ (see the figure).



Let us begin by associating, in a one-to-one and continuous manner, the points $L$ of this $\omega$ branch with the real numbers in such a way that, if the number $\lambda$ is associated with a point $L$ on this branch, the number $\lambda + 1$ is associated with its image $L_1$ by $T$, while making $Q$ correspond to $0$. This correspondence is not uniquely defined, but the most general correspondence of this type is deduced from any one correspondence by means of a transformation

$$\bar{\lambda} = \lambda + \varphi(\lambda)$$

where $\varphi(\lambda)$ is continuous and periodic of period 1 with $\varphi(0) = 0$, and where $\lambda + \varphi(\lambda)$ increases with $\lambda$. In the same manner we can associate analogous numbers $\mu$ with the point $M$ of the $\alpha$ branch.
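The three requirements on the change of parameter can be checked on a concrete instance. The Python sketch below assumes, purely for illustration, the choice $\varphi(\lambda) = c \sin 2\pi\lambda$ with $c = 0.1$; the text prescribes no particular $\varphi$, and the names `C`, `phi` and `reparam` are hypothetical. With $|c| < 1/(2\pi)$ the function $\lambda + \varphi(\lambda)$ is increasing, so the new parameter still sends $Q$ to $0$ and still advances by 1 under $T$.

```python
import math

# Hypothetical amplitude; |C| < 1/(2*pi) keeps lam + phi(lam) increasing.
C = 0.1

def phi(lam):
    """A continuous function of period 1 with phi(0) = 0 (illustrative choice)."""
    return C * math.sin(2 * math.pi * lam)

def reparam(lam):
    """The new parameter lam_bar = lam + phi(lam)."""
    return lam + phi(lam)

# Sample points in [-2, 2] with spacing 0.01.
samples = [k / 100 for k in range(-200, 201)]

# Q still corresponds to 0.
fixes_origin = reparam(0.0) == 0.0
# Advancing lam by 1 (one application of T) advances lam_bar by 1.
commutes_with_shift = all(
    abs(reparam(x + 1) - reparam(x) - 1) < 1e-9 for x in samples
)
# lam_bar increases with lam on the sampled grid.
increasing = all(
    reparam(a) < reparam(b) for a, b in zip(samples, samples[1:])
)
print(fixes_origin, commutes_with_shift, increasing)
```

Any other continuous $\varphi$ of period 1 with $\varphi(0) = 0$ and $\lambda + \varphi(\lambda)$ increasing would serve equally well in place of the sine term.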

Hence the arc $Q\omega P$ corresponds to the values $\lambda \ge 0$, while the images of $Q$ by $T$, namely $Q_1, Q_2, Q_3, \ldots$, correspond to $\lambda = 1, 2, 3, \ldots$; and in an analogous manner the arc $P\alpha Q$ corresponds to the values $\mu \le 0$, while the images of $Q$ by $T^{-1}$, namely $Q_{-1}, Q_{-2}, \ldots$, correspond to $\mu = -1, -2, \ldots$. Evidently there are no pairs $(\lambda, \mu)$ such that $\lambda \ge 0$ and $\mu \le 0$ except $(0, 0)$.

By the transformation $T$ such a point $(\lambda, \mu)$ is transformed into $(\lambda + 1, \mu + 1)$, and by $T^{-1}$ it is transformed into $(\lambda - 1, \mu - 1)$. It follows that there are no pairs such that $\lambda \ge i \ge \mu$, where $i$ is an arbitrary integer, except for $(i, i)$. Indeed, $(\lambda - i, \mu - i)$ would be of the type not admitted in the contrary case.

The network is entirely specified by this denumerable set of pairs $(\lambda, \mu)$, such that, if $(\lambda, \mu)$ belongs to it, $(\lambda \pm 1, \mu \pm 1)$ belongs to it also, where $\lambda$ and $\mu$ are considered as arbitrary ordered parameters of period 1.

Indeed, let us consider all the pairs $(\lambda, \mu)$ of the set for which $0 \le \lambda$, $\mu \le 1$, and arrange them with $\mu$ increasing:

$$(0, 0),\ (\lambda_1, \mu_1),\ \ldots,\ (\lambda_j, \mu_j),\ (1, 1) \qquad (0 < \mu_1 < \cdots < \mu_j < 1;\ \lambda_i < 1).$$

This order specifies completely the interlacing of $Q\omega Q_1$ and $Q\alpha Q_1$ (see the figure above).

There are here certain necessary conditions among these pairs in order that one may effectively realize the interlacing. Indeed, it is necessary and sufficient for this that, on successively suppressing adjacent couples (that is to say, couples such that the values of $\lambda$ and of $\mu$ are both adjacent), there remain only three pairs $(0, 0)$, $(\lambda_i, \mu_i)$ and $(1, 1)$ (1). Let us remark that all the crossings of $Q\alpha Q_1$ with $Q\omega P$ must take place between $Q$ and $Q_1$ on the arc $Q\omega P$; otherwise the arc $P\alpha Q$ would meet $Q\omega P$ in points other than $Q$.
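The suppression rule just stated can be tried out mechanically. The Python sketch below is one possible reading of it, under assumptions that are not in the text: the pairs are listed in order of increasing $\mu$ and only their $\lambda$-values are recorded; the end pairs $(0,0)$ and $(1,1)$ are never suppressed; and two neighboring pairs count as adjacent when no remaining $\lambda$-value lies between theirs. The function name `reducible_to_three` is hypothetical, and the search simply tries every order of suppression.

```python
from functools import lru_cache

def reducible_to_three(lams):
    """Decide whether suppressing adjacent couples can leave exactly three pairs.

    lams[k] is the lambda-value of the k-th pair when the pairs are sorted
    by increasing mu; the first and last entries stand for (0,0) and (1,1).
    """
    @lru_cache(maxsize=None)
    def search(state):
        if len(state) == 3:
            return True
        # Rank of each remaining lambda-value among those that remain.
        rank = {v: r for r, v in enumerate(sorted(state))}
        # Try each couple of mu-neighbors, excluding the two end pairs.
        for k in range(1, len(state) - 2):
            a, b = state[k], state[k + 1]
            if abs(rank[a] - rank[b]) == 1:  # also neighbors in lambda
                if search(state[:k] + state[k + 2:]):
                    return True
        return False

    return search(tuple(lams))

# Two interior crossings that are neighbors in both orders are suppressed,
# leaving (0,0), one interior pair and (1,1).
ok = reducible_to_three([0, 0.1, 0.3, 0.2, 0.4, 0.5, 1])
# No two mu-neighbors are lambda-neighbors, so nothing can be suppressed.
stuck = reducible_to_three([0, 0.2, 0.4, 0.1, 0.5, 0.3, 1])
print(ok, stuck)
```

Only the relative order of the $\lambda$-values matters in this sketch, which agrees with the later remark that the network is determined by the relative order of the $\lambda_i$.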

Since the points $Q$ and $Q_1$ are equivalent under $T$, and since the curve $P\alpha Q\omega P$ has two sides, one can represent the two interlacing arcs by a circle which corresponds to $Q\omega Q_1$ and by a closed curve without double point which crosses the circle with the same interlacing (see figure 31 and the accompanying figure). By modifying the closed curve continuously without making it pass through the center $C$ of the circle, one can make the adjacent points of crossing disappear in pairs, except for two, if one considers $Q$ as a fixed point.

We call this figure above the partial signature $S_1$ of the dynamical system.

(1) In what follows we consider only the case in which the surface of section is of genus 0. For the case of an arbitrary genus it is necessary to begin with a part of the network which renders the surface simply connected, instead of employing a simple loop.

In the same manner one can analyze the interlacing of $Q\omega Q_2$ with $Q\alpha Q_2$. This consists of the interlacing of $Q\omega Q_1$ with $Q\alpha Q_1$ already constructed, and of an identical interlacing of $Q_1\omega Q_2$ with $Q_1\alpha Q_2$, where $Q_1\alpha Q_2$ must be constructed so as to cross $Q\omega Q_1$ suitably. More precisely, it is necessary to consider the pairs $(\lambda, \mu)$ with $0 \le \mu \le 2$, which can be arranged in three groups in the following manner:

$$(0, 0),\ (\lambda_1, \mu_1),\ \ldots,\ (\lambda_j, \mu_j),\ (1, 1),$$

$$(1, 1),\ (\lambda_1 + 1, \mu_1 + 1),\ \ldots,\ (\lambda_j + 1, \mu_j + 1),\ (2, 2),$$

$$(\lambda_1', \mu_1'),\ (\lambda_2', \mu_2'),\ \ldots,\ (\lambda_k', \mu_k') \qquad (1 < \mu_1' < \cdots < \mu_k' < 2;\ \lambda_i' < 1).$$

Here again a necessary condition of possibility consists in the fact that one can arrive finally at the five pairs $(0, 0)$, $(\lambda_i, \mu_i)$, $(1, 1)$, $(\lambda_i + 1, \mu_i + 1)$, $(2, 2)$ by suppressing the adjacent pairs.

Since the points $Q$, $Q_1$ and $Q_2$ are equivalent under $T$, one can represent the two interlacing arcs $Q\omega Q_2$ and $Q\alpha Q_2$ by a circle (taken twice) which corresponds to $Q\omega Q_1$, and by a closed curve (taken twice) without double point which crosses the circle with the same interlacing as that of $Q\omega Q_2$ and $Q\alpha Q_2$ (see figure 31 and the accompanying figure).



We call this figure the partial signature $S_2$ of the given system. It may reduce precisely to $S_1$; more generally one obtains it from $S_1$ by modifying some of the arcs of $S_1$ continuously without changing the crossings which enter into $S_1$ (see the same figure), and by crossing the cut $CQ$ a single time from the left side.

In a similar manner we define the partial signatures $S_3, S_4, \ldots$ successively. In general $S_k$ is obtained from $S_{k-1}$ by modifying the arcs of $S_{k-1}$ continuously, which one succeeds in doing by retaining all the points of crossing of $S_{k-1}$ and by crossing the cut $CQ$ precisely $k - 1$ times from the left side.

The complete signature $S$ of the system consists of the set $S_1, S_2, \ldots$ of these partial signatures, of which it is, so to speak, a kind of limit.

Evidently this signature $S$ determines completely the interlacing of the asymptotic branches $P\alpha\infty$ and $P\omega\infty$ attached to the hyperbolic fixed point $P$, and thus all the invariants of the corresponding system. For example, the simplest invariant (other than the genus of the surface of section) is the number of crossings which enter into the first partial signature $S_1$.

Up to this point we have not made use of the fact that $T$ is conservative and transitive. But from the point of view of analysis situs both of these properties are reflected in the fact that the two branches $P\alpha\infty$ and $P\omega\infty$ are everywhere dense on the surface of section. For the network considered in itself this means only that between any two values of $\lambda$ or of $\mu$ there is another, and hence an infinite number of others.

Consequently every particular arc of an arbitrary $S_k$ must be effectively modified an infinite number of times in the partial signatures $S_{k+1}, \ldots$.

These conditions of possibility seem to be sufficient as well as necessary.

We see now exactly what there is of an arbitrary nature in this network. To construct it, it is necessary to know: the relative order of $\lambda_1, \lambda_2, \ldots, \lambda_j$ in the interval $(0, 1)$; the relative order of $\lambda_1', \lambda_2', \ldots, \lambda_k'$ with respect to $\lambda_1, \lambda_2, \ldots, \lambda_j$ in $(0, 1)$; the relative order of $\lambda_1'', \lambda_2'', \ldots, \lambda_l''$ with respect to $\lambda_1, \lambda_2, \ldots, \lambda_j$ and $\lambda_1', \lambda_2', \ldots, \lambda_k'$ in the interval $(0, 1)$; and so on. The partial signatures $S_1, S_2, \ldots$ constitute the geometric instrument which permits us to see all the conditions of interrelation of order which are imposed on the $\lambda_i, \lambda_i', \lambda_i'', \ldots$.

The following conclusion is now evident:

D. l. h. i., two regular dynamical systems will be topologically equivalent when they admit the same signature $S$, and only in this case. In the irregular case they will be $\Sigma$-equivalent when they admit the same signature.




9. Extensions and generalizations.

The hypotheses (1), (2), (3) which we have employed in this last chapter were made in order to fix attention on the most important cases of a dynamical system of two degrees of freedom. Let us make some remarks in this regard:

Without the hypothesis (1) of transitivity we should have had to classify the kinds of connected sets of $S$ which are invariant under the transformation $T$. The first step would be to define the "generalized neighborhood" of an arbitrary trajectory, which is composed of the trajectory together with the limit trajectories of all the trajectories which enter its immediate neighborhood. From this point of view our study above is restricted to the case in which the generalized neighborhood consists of $S$ in its entirety. I have little doubt that it will extend to such generalized neighborhoods of an arbitrary trajectory in the non-transitive case.

The hypothesis (2) that there exists a regular surface of section is probably satisfied by the systems which do not differ too much from an integrable system, as we have indicated. It seems almost certain that the systems which do not admit such a surface of section are more complicated in structure. Nevertheless one could define the sets $\Sigma_Q$ relative to an open surface of section, which always exists.

Hence by dividing $S$ into its generalized neighborhoods and by employing open surfaces of section one could avoid the first two hypotheses (1) and (2).

As for the hypothesis (3), it seems rather probable that there always exist h-periodic trajectories with homoclinic trajectories, except in the integrable cases and the two simple cases pointed out above. If there are degenerate periodic trajectories, they offer no difficulty, since one can regard them as irregular periodic solutions when we do not have $\Sigma_Q = Q$ for these trajectories.

These remarks apply equally well to any non-singular problem of the third order, even if it is not of dynamical origin.

En resume done, toutes les idees generates que nous avons employees 
ici, ainsi que les extensions que nous venons d’indiquer, semblent destinees 
k jouer un role important dans toute theorie definitive des systemes diffe* 
rentielles irreductibles du troisieme ordre dans le domaine reel. Seulement 
la tache de distinguer entre tous les cas possibles (surtout quand il n'existe 

Serie HI. Vol. 1. 15 


658 



214 


o. D. BIRKHOPP 


pas une surface rdguliere de section) prdsente de formidables difficult^ de 
complexity. Ces difficult^ deviennent beaucoup plus grandes encore dans le 
cas des systemes d’ordre n > 3. 

Ndanmoins il faut espdrer que 1'invention de symbolismes convenables 
et la ddcouverte de nouveaux details de structure simplifieront beaucoup 
cette tache, et rendront ainsi possible une thdorie tout 4 fait satisfaisante 4 
l’esprit mathdmatique. 


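TABLE OF CONTENTS

                                                                     Page
Introduction ......................................................... 85

CHAPTER I.
The Pfaffian equations of the fourth order.

1. Reduction to a system of the third order .......................... 88
2. Reduction to the canonical form with a single degree of freedom ... 92
3. Reduction to a point transformation T ............................. 94
4. Existence of open surfaces of section ............................. 96
5. A criterion for the existence of regular surfaces of section ..... 100
6. An example ....................................................... 102
7. An application ................................................... 108
8. Remarks .......................................................... 110

CHAPTER II.
The e-periodic solutions and the neighboring solutions.

1. General classification of the periodic solutions ................. 113
2. The α- and ω-asymptotic solutions of an e-periodic solution ...... 116
3. The transformation of X_α and X_ω by ... ......................... 123
4. The sets X_α and X_ω with branches and without branches .......... 128
5. The interrelation of the sets X_α and X_ω ........................ 131
6. The neighboring periodic solutions ............................... 132
7. Extension to the other e-periodic solutions ...................... 136
8. Distribution of the e-periodic solutions ......................... 144
9. On the possibility of generalization ............................. 144

CHAPTER III.
The hyperbolic solutions and the solutions ...

1. The invariant curves in the general case ......................... 145
2. A lemma .......................................................... 146
3. The neighborhood of a general h-periodic solution ................ 155
4. Extension to the other hyperbolic cases .......................... 160
5. Application to analytic continuation ............................. 172
6. Some general remarks ............................................. 174

CHAPTER IV.
The extended neighborhood of the h-periodic solutions.

1. Some periodic solutions in the extended neighborhood ............. 175
2. A lemma .......................................................... 178
3. Study of the extended neighborhood of an h-periodic solution ..... 179
4. The extended neighborhood of a cycle of h-periodic solutions ..... 185

CHAPTER V.
General theory of the solutions.

1. First ... ........................................................ 187
2. Density of the asymptotic solutions of a periodic solution ....... 189
3. Relations among the asymptotic solutions ......................... 196
4. The isomorphic solutions ......................................... 197
5. The regular recurrent solutions .................................. 200
6. The irregular recurrent solutions ................................ 202
7. The general solution ............................................. 205
8. The signature .................................................... 208
9. Extensions and generalizations ................................... 213
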

Reprinted from Journ. de Mathématiques, s. 9, Vol. 15, 1936, pp. 339-344.


Note on stability in Dynamics;

By George D. BIRKHOFF.


In its simplest form the problem of stability in Dynamics presents itself in the following manner. Let there be given a real transformation $T$ of the variables $u$, $v$:

$$u_1 = \varphi(u, v) = au + bv + \cdots, \qquad v_1 = \psi(u, v) = cu + dv + \cdots,$$

where the functions $\varphi$ and $\psi$ are analytic at the origin $(0, 0)$ of the plane of the variables $u$, $v$ and vanish at that point. The origin is therefore a fixed point of $T$. Suppose further that the double integral

$$\iint Q(u, v)\,du\,dv,$$

where

$$Q(u, v) = p + qu + rv + \cdots \qquad (p \neq 0),$$

is invariant under the transformation $T$. Then the characteristic equation in $\rho$ at the fixed point $(0, 0)$,

$$\rho^2 - (a + d)\rho + ad - bc = 0,$$

will be a reciprocal equation, since $ad - bc = 1$. In the case $|a + d| < 2$ this equation will have two roots $e^{i\alpha}$ and $e^{-i\alpha}$ ($i = \sqrt{-1}$) of modulus 1. Suppose moreover that these roots are not $n$-th roots of unity. To solve the problem of effective stability for this transformation $T$ (of the formally stable type) one must: either prove that in any neighborhood whatever of the point $(0, 0)$ there exist points $P$ whose successive images under $T$ or $T^{-1}$ pass outside a fixed neighborhood of $(0, 0)$, or else prove that no such fixed neighborhood exists. In the first case the point $(0, 0)$ would be unstable; in the other case this point would be stable.
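The condition on the characteristic equation can be checked numerically. The following is a minimal sketch, assuming a linear part with determinant 1; the function names and the sample rotation through 1 radian are illustrative choices, not taken from the Note.

```python
import cmath
import math

def characteristic_roots(a, b, c, d):
    """Roots of rho^2 - (a + d) rho + (ad - bc) = 0 for the linear part of T."""
    trace = a + d
    det = a * d - b * c
    disc = cmath.sqrt(trace * trace - 4.0 * det)
    return (trace + disc) / 2.0, (trace - disc) / 2.0

def rotation_angle(a, b, c, d):
    """Return alpha in (0, pi) with roots exp(+-i alpha), or None if |a + d| >= 2."""
    if abs(a * d - b * c - 1.0) > 1e-12:
        raise ValueError("the linear part must have determinant 1")
    trace = a + d
    if abs(trace) >= 2.0:
        return None
    return math.acos(trace / 2.0)

# Illustrative linear part: a rotation through 1 radian (an assumption).
a, b, c, d = math.cos(1.0), -math.sin(1.0), math.sin(1.0), math.cos(1.0)
r1, r2 = characteristic_roots(a, b, c, d)
print(abs(r1), abs(r1 * r2), rotation_angle(a, b, c, d))
```

The product of the two roots equals the determinant, which is why the equation is reciprocal; the case $|a + d| \geq 2$ returns `None` because the roots are then real.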

Up to the present, in spite of the greatest efforts of the geometers, it has not been possible to prove that the unstable case can occur, although, in all probability, instability (very slow) occurs in general. Even if we suppose only that $T$ is of class $C^k$ (1) ($k$ finite and positive), it seems very difficult to prove that instability can occur.

On the other hand it is known that in certain analytic cases there exist annular regions of instability in the immediate neighborhood of a stable point (2). Indeed, in the stable case there always exists an infinite family of closed invariant curves about the fixed point, shrinking down upon it; these curves are rectifiable, and each of them is met by an arbitrary ray issuing from $(0, 0)$ in a single point. To such an invariant curve there corresponds a "rotation coefficient". It may happen that all these coefficients are incommensurable with $2\pi$. In that case all the invariant curves are entirely distinct from one another. The rotation coefficients are then arranged in the same order as these invariant curves. The annular regions of instability are the regions of the plane between any two successive invariant curves.

There is a close analogy between the neighborhood of one of the boundaries of such an annular region and the annular neighborhood of an unstable (but formally stable) fixed point. In this Note we wish to bring out this analogy more clearly. If polar coordinates $r$, $\theta$ are introduced, the point $r = 0$ corresponds to an invariant curve with a rotation coefficient $\alpha$ such that $\alpha/2\pi$ is irrational. Let $\beta$ be the rotation coefficient corresponding to such an invariant curve on the boundary of a region of instability. Let us introduce the new variables $\rho$, $\varphi$, where $\rho = r^2 - f^2(\theta)$, $\varphi = \theta$, and where $r = f(\theta)$ is the equation of the invariant curve. With these variables, $T$ becomes a transformation $T^*$ which is one-to-one and continuous in the neighborhood of the origin of the plane of the variables $\rho$, $\varphi$, and of which the origin is a fixed point. This transformation admits the invariant integral $\iint P\,d\rho\,d\varphi$. Moreover $T^*$ transforms the radial directions among themselves, without any radial direction being left invariant by $T^{*k}$ ($k$ arbitrary). The rotation coefficient $\beta$ is likewise attached to this transformation of the directions.

Thus this one-to-one and continuous transformation $T^*$ enjoys the most important qualitative properties of the analytic transformation $T$. Nevertheless it is evidently unstable at the origin.

(1) To say that a function is of class $C^k$ is to say (by definition) that it possesses continuous derivatives up to the $k$-th order.

(2) See my book Dynamical Systems.

Let us go a little further.

Assuming that stability holds for all analytic transformations $T$, one obtains a family of rectifiable invariant curves corresponding to an arbitrary transformation $T$. It therefore seems almost certain that, for a suitable choice of $T$, one can obtain an invariant curve on the boundary of an annular region of instability for which the rotation coefficient $\beta$ is an arbitrary number. It seems certain also that these rectifiable curves can be of class $C^k$ ($k$ arbitrarily large). But along such a curve, according to a very important and well-known result of M. Denjoy, the transformation of the invariant curve must, for $k > 2$, be topologically equivalent to an ordinary rotation.

In other words there exists a continuous periodic function of period $2\pi$, say $\varphi(\theta)$, such that the function $\theta + \varphi(\theta)$ increases with $\theta$ and satisfies the functional equation

$$\theta_1(\theta) + \varphi[\theta_1(\theta)] = \theta + \varphi(\theta) + \beta.$$

Here $\theta$ denotes the value of the coordinate $\theta$ at the point considered, while $\theta_1(\theta)$ denotes the value of $\theta$ at the transformed point. It seems equally certain that the continuous function $\varphi(\theta)$ so introduced can have as many continuous derivatives as desired, with $1 + \varphi'(\theta) > 0$ everywhere.
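The rotation coefficient of a transformation of a closed curve can be estimated numerically as the mean angular advance per iteration. The sketch below assumes a hypothetical one-parameter family $\theta_1 = \theta + s + \varepsilon\sin\theta$ with $|\varepsilon| < 1$; this family and the names `shift` and `eps` are illustrative assumptions and do not appear in the Note.

```python
import math

def circle_map_lift(theta, shift, eps):
    """Lift of an increasing circle map theta -> theta + shift + eps*sin(theta), |eps| < 1."""
    return theta + shift + eps * math.sin(theta)

def rotation_coefficient(shift, eps, iterations=20000, theta0=0.0):
    """Mean angular advance per iteration, an estimate of the rotation coefficient."""
    theta = theta0
    for _ in range(iterations):
        theta = circle_map_lift(theta, shift, eps)
    return (theta - theta0) / iterations

# For eps = 0 the map is an ordinary rotation and the coefficient equals the shift.
print(rotation_coefficient(1.0, 0.0))
print(rotation_coefficient(1.0, 0.3))
```

For an increasing lift the estimate differs from the true coefficient by less than $2\pi$ divided by the number of iterations, whatever the starting point, so two starting points give nearly the same value.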

Admitting all this as very probable, one can show without difficulty that, starting from the hypothesis of stability, one is necessarily led to reject this hypothesis, at least for transformations $T^*$ of class $C^k$ ($k$ arbitrarily large) if not analytic.

Indeed, if the variable $\theta$ is replaced by $\psi = \theta + \varphi$ and the variable $r$ by $\rho$, where $\rho = r^2 - f^2(\theta)$, one can write $T$ in the following manner:

$$\rho_1 = \rho\,[a_0(\psi) + a_1(\psi)\rho + \cdots + A_k(\rho, \psi)\rho^{k-1}],$$

$$\psi_1 = \psi + \beta + \rho\,[b_0(\psi) + b_1(\psi)\rho + \cdots + B_k(\rho, \psi)\rho^{k-1}],$$

where the functions $A_k$, $B_k$ are of class $C^l$ ($l$ arbitrarily large) for $\rho > 0$ and bounded.

Since $T$ itself can be taken such that the invariant integral is the ordinary area, the transformation $T$ in the modified variables will also preserve the area $\iint d\rho\,d\psi$. Hence one must have the identity

$$\frac{\partial(\rho_1, \psi_1)}{\partial(\rho, \psi)} = 1.$$

Substituting and comparing, one sees that $a_0(\psi) \equiv 1$ in the equations above.

Let us now seek a continuous function $\lambda(\psi)$ and a constant $c_0$ such that

$$\lambda(\psi + \beta) - \lambda(\psi) = b_0(\psi) - c_0.$$

To this end, let us expand the function $b_0(\psi)$ in a Fourier series

$$b_0(\psi) = \sum_{n=-\infty}^{+\infty} c_n e^{in\psi} \qquad (c_{-n} = \bar{c}_n).$$

Formally we obtain for $\lambda$ a corresponding analogous series

$$\lambda(\psi) = {\sum}' \frac{c_n}{e^{in\beta} - 1}\, e^{in\psi},$$

where $n \neq 0$ in the summation $\sum'$. But it is well known that for a suitable choice of $\beta$ (for example $\beta = 2\pi\sqrt{2}$) the absolute value of $e^{in\beta} - 1$ is at least $K/|n|$ for arbitrary $n$, $K$ being a positive constant. It is therefore clear that the function $\lambda(\psi)$ so obtained will have continuous derivatives of order $k - 4$ at least, hence of arbitrarily high order.
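The term-by-term solution of the difference equation $\lambda(\psi + \beta) - \lambda(\psi) = b_0(\psi) - c_0$ can be illustrated for a trigonometric polynomial. This is a minimal sketch under stated assumptions: the coefficients in `b_coeffs` are invented for the illustration, and only finitely many Fourier terms are kept, so the question of convergence does not arise here.

```python
import cmath
import math

def solve_difference_equation(coeffs, beta):
    """coeffs maps n -> c_n for b(psi) = sum of c_n * exp(i*n*psi).

    Returns (c0, lam), where lam maps each n != 0 to c_n / (exp(i*n*beta) - 1).
    """
    c0 = coeffs.get(0, 0.0)
    lam = {}
    for n, c_n in coeffs.items():
        if n == 0:
            continue
        divisor = cmath.exp(1j * n * beta) - 1.0
        if abs(divisor) < 1e-14:
            raise ZeroDivisionError("n * beta is a multiple of 2*pi for n = %d" % n)
        lam[n] = c_n / divisor
    return c0, lam

def evaluate(coeffs, psi):
    """Value of the trigonometric polynomial with the given coefficients at psi."""
    return sum(c * cmath.exp(1j * n * psi) for n, c in coeffs.items())

beta = 2.0 * math.pi * math.sqrt(2.0)  # the choice of beta suggested in the text
# Hypothetical real-valued b: the coefficients satisfy c_{-n} = conjugate(c_n).
b_coeffs = {0: 0.3, 1: 0.5, -1: 0.5, 2: -0.25j, -2: 0.25j}
c0, lam = solve_difference_equation(b_coeffs, beta)
residual = evaluate(lam, 0.7 + beta) - evaluate(lam, 0.7) - (evaluate(b_coeffs, 0.7) - c0)
print(abs(residual))
```

The divisors $e^{in\beta} - 1$ are the small divisors of the problem; the lower bound quoted in the text is what keeps the quotients $c_n/(e^{in\beta} - 1)$ decaying fast enough when infinitely many terms are present.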



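Let us introduce further new variables $\bar\rho$, $\bar\psi$:

$$\bar\rho = \rho, \qquad \bar\psi = \psi - \rho\lambda(\psi).$$

On adopting these modified variables the transformation $T$ takes a form analogous to the form above,

$$\bar\rho_1 = \bar\rho\,[1 + \bar a_1(\bar\psi)\bar\rho + \cdots],$$

$$\bar\psi_1 = \bar\psi + \beta + \bar\rho\,[\bar b_0(\bar\psi) + \bar b_1(\bar\psi)\bar\rho + \cdots],$$

but with $\bar b_0(\bar\psi) = c_0$. With these variables $\bar\rho$, $\bar\psi$ the invariant integral is written

$$\iint \frac{d\bar\rho\,d\bar\psi}{1 - \bar\rho\lambda'(\psi)}.$$

By another slight modification one can choose new variables $\rho^*$, $\psi^*$ for which

$$\frac{\partial(\rho_1^*, \psi_1^*)}{\partial(\rho^*, \psi^*)} = 1,$$

without modifying the form just written; indeed one can choose $\rho^*$, $\psi^*$ such that

$$\psi^* = \bar\psi \quad \text{and} \quad \frac{\partial\rho^*}{\partial\bar\rho} = \frac{1}{1 - \bar\rho\lambda'(\psi)}.$$

To simplify, let us denote these variables $\rho^*$ and $\psi^*$ by $\rho$ and $\psi$ respectively.

Using the condition $\partial(\rho_1, \psi_1)/\partial(\rho, \psi) = 1$ for $\rho = 0$, one obtains immediately by direct comparison the equation $a_1(\psi) \equiv 0$. Thus we can write $T$ in a still more specialized form

$$\rho_1 = \rho\,[1 + \ast + a_2(\psi)\rho^2 + \cdots],$$

$$\psi_1 = \psi + \beta + \rho\,[c_0 + b_1(\psi)\rho + \cdots],$$

where the functions on the right still possess derivatives of arbitrarily high order.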

Continuing in an analogous manner, one obtains, after further suitable modifications of the variables, the following form for $T$:

$$\rho_1 = \rho\,[1 + \ast + \cdots + \ast + A_k(\rho, \psi)\rho^{k-1}],$$

$$\psi_1 = \psi + \beta + \rho\,[c_0 + c_1\rho + \cdots + c_{k-2}\rho^{k-2} + B_k(\rho, \psi)\rho^{k-1}],$$

where $A_k$, $B_k$ are continuous functions of class $C^l$ ($l$ arbitrarily large) and of order $k - 1$ at most in $\rho$, and where $c_0, \ldots, c_{k-2}$ are constants.

But with the rectangular coordinates $u$, $v$, where

$$u = \rho^{1/2}\cos\psi, \qquad v = \rho^{1/2}\sin\psi,$$

this transformation is written

$$u_1 = u\cos[\beta + c_0(u^2 + v^2) + \cdots + c_{k-2}(u^2 + v^2)^{k-1}] - v\sin[\beta + c_0(u^2 + v^2) + \cdots + c_{k-2}(u^2 + v^2)^{k-1}] + U_k(u, v),$$

$$v_1 = u\sin[\beta + c_0(u^2 + v^2) + \cdots + c_{k-2}(u^2 + v^2)^{k-1}] + v\cos[\beta + c_0(u^2 + v^2) + \cdots + c_{k-2}(u^2 + v^2)^{k-1}] + V_k(u, v),$$

where $U_k$, $V_k$ are of class $C^m$ ($m$ arbitrarily large) and of order $k - 1$ at most in $\rho$. Moreover $\iint du\,dv$ remains the invariant integral with these variables.

This is the normal form that we wished to obtain. Indeed, in this form, the transformation is of class $C^m$ and formally stable; nevertheless, the origin is an unstable fixed point.

Thus the hypothesis of stability in the case of formal stability leads us to the excluded unstable case, at least if one admits transformations $T$ of class $C^l$ ($l$ arbitrarily large) as well as analytic transformations.

Consequently it seems certain that instability occurs in general in the case of formal stability in Dynamics.





Reprinted from Annali della R. Scuola Normale Superiore di Pisa, s. 2, Vol. 5, 1936, pp. 1-42.


ON THE RESTRICTED PROBLEM OF THREE BODIES (1)

(SECOND MEMOIR)

by George D. Birkhoff (Cambridge, Mass.).


II.

Qualitative part.


1. - Some preliminaries.

Let us represent the surface of section $S_2$ in the ordinary plane, for example by employing $\rho + \text{const.}$ and $\theta$ as polar coordinates of this plane. Thus $S_2$ becomes a region bounded by two concentric circles $\rho = \rho'$ and $\rho = \rho''$. The points $\theta = 0$ of this region represent states of motion in which the corresponding curve crosses the $x$-axis at right angles in the direct or retrograde sense, but to the right of $J$ (see fig. 1). These two series of states, which coincide at the origin, that is to say at the point $\rho = \theta = 0$ of $S_2$, constitute a single analytic series. We shall denote the corresponding line by $\alpha\beta$, where $\alpha$ and $\beta$ are situated respectively on the curve $L_1$ and on the curve $L_2$ (see fig. 4).

In the same manner the line $\gamma\delta$, for which $\theta = \pi$, will correspond to the single series of states of motion in which the curve crosses the $x$-axis at right angles but to the left of $J$.

There exists also another series analogous to $\alpha\beta$, corresponding to points of $S_3$ which are not situated on $S_2$ although they are, at least in part, situated on its analytic continuation. This series includes the state of zero velocity to the right of $J$. The two series constitute a closed analytic curve in $S_3$, namely the points $p = [\ldots] = 0$ in the regularizing coordinates $p$, $q$, $p'$, $q'$. Similarly there exists also another series analogous to $\gamma\delta$ outside $S_2$.

Let $\alpha''\beta''$ be the line of $S_2$ in which the trajectories that issue from the series of points associated with $\alpha\beta$ cross $S_2$ for the first time after the time $t$. The symmetrically situated line $\alpha'\beta'$ will then be the line in which the same trajectories cross $S_2$ for the first time at an earlier time $\tau$. One must forget neither the geometric meaning of $S_2$, nor the symmetry of $S_2$ and of the auxiliary curves of motion $L_1$ and $L_2$.

But one evidently has the relations $T(\alpha'\beta') = \alpha''\beta''$ and $R(\alpha'\beta') = \alpha''\beta''$ (2), whence it follows that $U(\alpha'\beta') = \alpha'\beta'$, that is to say that $\alpha'\beta'$ constitutes half of the axis of the involutory transformation $U$, each point of $\alpha'\beta'$ being a fixed point of $U$. The other half of this axis is the analogous line $\gamma'\delta'$. Similarly the lines $\alpha''\beta''$ and $\gamma''\delta''$ constitute the analogous axis of the involutory transformation $\bar{U} = RUR$, where $T = \bar{U}R$. It is evident that all six lines $\alpha\beta$, $\gamma\delta$, $\alpha'\beta'$, $\gamma'\delta'$, $\alpha''\beta''$, $\gamma''\delta''$ must be entirely distinct.

If a suitable deformation of the plane were made, one could regard $U$ as an ordinary reflection with respect to the axis $\alpha'\beta'$, $\gamma'\delta'$. One thus sees why the two curves $\alpha\beta$, $\gamma\delta$ and the curves $\alpha'\beta'$, $\gamma'\delta'$ play entirely analogous roles.

We know also that for $\mu = 0$ one has

$$\rho_1 = \rho, \qquad \theta_1 = \theta - 2\pi a^{3/2}(\rho, \lambda),$$

where the last term on the right is always less than $\pi$ in absolute value, and small together with $\lambda$. Hence the transformation $RT = U$ becomes

$$\rho_1 = \rho, \qquad \theta_1 = -\theta + 2\pi a^{3/2}(\rho, \lambda).$$

Consequently the equations of $\alpha'\beta'$ and $\gamma'\delta'$ for $\mu = 0$ are respectively

$$\theta = \pi a^{3/2}(\rho, \lambda) \quad \text{and} \quad \theta = \pi\,(1 + a^{3/2}(\rho, \lambda)).$$

Hence these lines always lie respectively above the axis $\theta = 0$ and below this axis, with $\alpha'\beta'$, $\alpha''\beta''$ near $\theta = 0$ and $\gamma'\delta'$, $\gamma''\delta''$ near $\theta = \pi$ for $\lambda$ small (see fig. 4).

Moreover, the quantity $2\pi a^{3/2}$, where $a$ represents the semi-major axis of the ellipse of the motion, increases when a point traverses $\alpha\beta$ from $\alpha$ to $\beta$. Hence the lines $\alpha\beta$ and $\gamma\delta$ will be turned by $T$ to the right of the radial direction; evidently the line into which $T$ transforms any radial line will also be turned to the right (3). In the same manner one sees that the images of any radial line under the inverse transformation $T^{-1}$ will be turned to the left of the radial direction. This qualitative property of the transformation of the directions will remain valid even for $\mu$ sufficiently small. Indeed the transformation of $(\rho, \theta)$ into $(\rho_1, \theta_1)$ remains analytic even for $(\rho, \theta)$ slightly outside $L_1$ and $L_2$. Hence the images of the radial lines will vary analytically with $\mu$ even outside $S_2$.

(1) Sur le problème restreint des trois corps. Premier mémoire. Annali della R. Scuola Normale Superiore di Pisa, s. II, vol. IV, pp. 267-306.

(2) We have seen that $T = RU$, where $R$ is an ordinary reflection in $\theta = 0$, and $U$ is involutory.

(3) See III, sections 42-46.


It is evident that the composite transformations $T^k$, for $k = 1, 2, 3, \ldots$, turn the radial directions to the right and that the inverse transformations $T^{-k}$ turn the radial directions to the left.

Moreover, if $P$ is a point of $\alpha'\beta'$ or of $\gamma'\delta'$, hence such that $U(P) = P$, and if one writes $T^k = RU^{(k)}$ where $k$ is odd and positive, one will have

$$U^{(k)}\,T^{-\frac{k-1}{2}}(P) = R\,T^{\frac{k+1}{2}}(P) = (UR)^{\frac{k-1}{2}}\,U(P) = T^{-\frac{k-1}{2}}(P).$$

Consequently the axis of the involutory transformation $U^{(k)}$ for $k$ odd and positive is the $(k-1)/2$-th image of $\alpha'\beta'$ and $\gamma'\delta'$ under $T^{-1}$.

In the same manner, if $P$ is a point of $\alpha\beta$ or of $\gamma\delta$, hence such that $R(P) = P$, one will have for $k$ even and positive

$$U^{(k)}\,T^{-\frac{k}{2}}(P) = R\,T^{\frac{k}{2}}(P) = T^{-\frac{k}{2}}\,R(P) = T^{-\frac{k}{2}}(P).$$

Consequently the axis of $U^{(k)}$ for $k$ even is composed of the $k/2$-th images of $\alpha\beta$ and $\gamma\delta$ under $T^{-1}$.
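The relations among $T$, $R$, $U$ and the axes of $U^{(k)}$ can be checked numerically on the integrable model $\mu = 0$, written on the lifted angle $\theta$ without reduction modulo $2\pi$. This is a sketch under stated assumptions: the function `semi_major_axis` is a hypothetical monotone stand-in for $a(\rho, \lambda)$, and the sign convention $\theta_1 = \theta - 2\pi a^{3/2}$ is the one under which $T = RU$ and $U = RT$ hold with products read as "apply the right-hand factor first".

```python
import math

def semi_major_axis(rho):
    """Hypothetical increasing stand-in for a(rho, lambda); not taken from the memoir."""
    return 0.1 + 0.05 * rho

def T(point):
    rho, theta = point
    return (rho, theta - 2.0 * math.pi * semi_major_axis(rho) ** 1.5)

def T_inv(point):
    rho, theta = point
    return (rho, theta + 2.0 * math.pi * semi_major_axis(rho) ** 1.5)

def R(point):
    """Ordinary reflection in the axis theta = 0."""
    rho, theta = point
    return (rho, -theta)

def U(point):
    """U = RT: apply T first, then the reflection R."""
    return R(T(point))

def power(f, n, point):
    """Apply the map f to the point n times."""
    for _ in range(n):
        point = f(point)
    return point

def close(p, q, tol=1e-12):
    return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol

P = (1.3, 0.7)
print(close(U(U(P)), P))   # U is involutory
print(close(T(P), R(U(P))))  # T = RU

# A point of the axis of U lies on theta = pi * a^(3/2).
A = (1.3, math.pi * semi_major_axis(1.3) ** 1.5)
k = 5
X = power(T_inv, (k - 1) // 2, A)
print(close(R(power(T, k, X)), X))  # X is fixed by U^(k) = R T^k
```

The last check is the odd case of the identity above: the $(k-1)/2$-th image under $T^{-1}$ of a point of the axis of $U$ is a fixed point of $U^{(k)}$.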

2. - Hypothesis of ordinary transitivity.

It seems almost certain that the restricted problem of three bodies is transitive for $\mu$ positive and sufficiently small, although it has not been possible to prove this; indeed the much simpler problem of permanent stability in the neighborhood of an elliptic (stable) periodic motion has not been solved, in spite of the greatest efforts of the geometers.

We shall now introduce the hypothesis of ordinary transitivity of $T$ and of its powers: Given, for $\mu \neq 0$ and small, any two points $P$ and $Q$ of the surface of section $S_2$, and two arbitrarily small neighborhoods of these points, one can find in these respective neighborhoods points $P'$ and $Q'$ such that $P'$ is transformed into $Q'$ by a suitable power of $T^k$ ($k = 1, 2, \ldots$), where the integer $k$ is arbitrary and given in advance.

An immediate consequence of this hypothesis is that the fixed points of $T, T^2, \ldots$ are always isolated. Otherwise there would be an analytic curve of points invariant with respect to a certain power $T^k$ of $T$. If this curve lay completely in the interior of $S_2$, it would divide $S_2$ into parts invariant under $T^k$, and one sees that $T^k$ could not be transitive. In the same manner, if the curve of fixed points leaves $S_2$, it must leave it an even number of times; but, as before, this curve cannot divide $S_2$ into two distinct parts. Hence the only possibility is that the curve extends from $L_1$ to $L_2$ without having any multiple point. But according to the known symmetry of $T^k$ with respect to $\theta = 0$, the image of this curve with respect to $\theta = 0$ is likewise formed of fixed points of $T^k$. Thus the two curves would divide $S_2$ into invariant parts, unless these two symmetric curves reduce to the line $\theta = 0$ or to $\theta = \pi$. But in that case all the points of $\alpha\beta$ or of $\gamma\delta$ would be fixed under $T^k$, which is impossible since $T^k$ turns the radial directions to the right.

3. - Classification of the symmetric periodic motions.

Every symmetric periodic curve must cut the $x$-axis in the $x$, $y$ plane at right angles twice and only twice. Indeed, let us choose two symmetric points of this curve. By letting $t$ increase at one of these points and decrease at the other, one obtains two symmetric arcs. At a certain moment these arcs will join in a point of crossing at right angles. By letting $t$ increase further one necessarily obtains, in a half-period, another point of the same kind. On the other hand an arc terminated in two adjacent points of this kind, joined to its symmetric image with respect to the $x$-axis, must constitute a complete curve of motion.

From this point of view there are ten possible classes of such symmetric periodic motions, according to the types of the two crossings, which we divide into four categories in the following manner:

(aa):  $(\alpha\beta, \alpha\beta)$;  $(\gamma\delta, \gamma\delta)$;  $(\alpha'\beta', \alpha'\beta')$;  $(\gamma'\delta', \gamma'\delta')$,
(ab):  $(\alpha\beta, \gamma\delta)$;  $(\alpha'\beta', \gamma'\delta')$,
(aa'): $(\alpha\beta, \alpha'\beta')$;  $(\gamma\delta, \gamma'\delta')$,
(ab'): $(\alpha\beta, \gamma'\delta')$;  $(\gamma\delta, \alpha'\beta')$.

The classes of any one of these four categories (aa), (ab), (aa'), (ab') evidently admit the same kind of mathematical analysis. For example, one finds in the first category all the symmetric periodic motions whose two crossings at right angles belong to a single series $\alpha\beta$, $\alpha'\beta'$, $\gamma\delta$, $\gamma'\delta'$.

These points of crossing must vary analytically with $\mu$ along the corresponding lines of symmetry. They cannot disappear except by passing through $L_1$ or through $L_2$, or by coinciding with another point of the same type.

During such a variation of $\mu$ the two characteristic integers cannot change, except momentarily at the instants of coincidence when the motion degenerates into a motion taken $\kappa$ times, and when $k$ and $l$ are replaced by $k/\kappa$ and $l/\kappa$ respectively. Indeed, during this variation every identical analytic relation of the form $T^k(\rho, \theta) = (\rho, \theta + 2l\pi)$ must persist indefinitely.

The index (4) of a fixed point $P$ of $T^k$ cannot change except at the moments of coincidence with other fixed points of $T^k$. A necessary condition for such a change is that the sum of the indices remain the same before and after the moment of coincidence.

For example let us consider the simplest possibility, namely that in which only two simple fixed points $P_1$ and $P_2$ merge and disappear. Since there are no fixed points after the coincidence, the total index before this coincidence must be zero. Hence the indices before the coincidence are 1 and $-1$ respectively, or else they are both zero. But the second possibility can present itself only at multiple fixed points. One might imagine that $P_1$ is hyperbolic with interchange of the asymptotic branches in cycles of $l$ ($l \neq 1$), since the index is then 1. But in this case the reciprocal characteristic roots would be identically $l$-th roots of 1, even at coincidence. This is not possible, since the index of $P_1 = P_2$ is 0, while at a point with such characteristic roots the index is always 1.

Hence the only possible case is that of the indices $+1$ and $-1$. The point $P_2$ of index $-1$ must be hyperbolic and unstable, with four invariant branches ending at this point. These branches are grouped analytically in pairs. The other point $P_1$, with index 1, must be elliptic and stable. The figure at the right will clarify this case. The dotted curves about $P_1$ represent a family of nearly invariant curves which surround it. The limiting case of coincidence is found at the right of the figure; indeed for $P_1 = P_2$ the fixed point has index zero and must have two asymptotic branches ending at this point. These branches are represented by one and the same formal series (5). They form in general a cusp at the origin. There are thus in this case two periodic motions, the one elliptic (stable), the other hyperbolic (unstable) with four asymptotic branches ending at it, which coincide in a hyperbolic motion with two such branches before disappearing.

In what follows we shall call "primary" the periodic motions whose characteristic integers $k$ and $l$ have no common factor; "secondary" if $k$ and $l$ admit only the factor 2 in common; "tertiary" if they admit the factor 3; and so on, the motions of the $s$-th kind being those for which the characteristic integers $k$ and $l$ admit precisely the common factor $s$.

(4) When a point $Q$ in the neighborhood is made to circulate about $P$ in the positive sense, the vector from $Q$ to its image turns through an angle $2i\pi$, where $i$ is the index in question. A detailed examination, in the case of a transformation which preserves areas, shows that the index will be 1, 0, or a negative integer; and that the index determines in large part the character of the motion (stable or unstable, etc.). The study of the indices in some simple cases was made by Poincaré. The most general discussion of the case of an isolated fixed point is found in my Memoirs III and IV. The case of an extreme degeneracy into analytic curves of fixed points, which does not enter here because of the transitivity assumed, still remains to be studied.

(5) If this series converges the two branches form only a single analytic branch which passes through the point; otherwise the two branches are analytic except at the end, where they are "hypercontinuous". See III, section 36.

Let us remark that for $\mu = 0$ all the periodic motions are primary. Indeed if $T^k(\rho, \theta) = (\rho, \theta + 2l\pi)$ and if $k$ and $l$ admit the common factor $q$ with $k = qk_1$, $l = ql_1$, one will also have $T^{k_1}(\rho, \theta) = (\rho, \theta + 2l_1\pi)$. It is evident that $k_1$ and $l_1$ will be the characteristic integers of this periodic motion for $\mu = 0$.

In general, if one has $T^k(\rho, \theta) = (\rho, \theta + 2l\pi)$, the ratio $2l\pi/k$ indicates (whatever $\mu$ may be) the mean angular advance of the periodic point $(\rho, \theta)$ under repeated iteration of $T$. Hence the characteristic integers $k_1$, $l_1$ can only be equal submultiples of $k$, $l$, with $l_1/k_1 = l/k$.


4. - A necessary condition for the existence of periodic motions.

A necessary condition that periodic motions, symmetric or not, with characteristic integers $k$, $l$ can exist is that one have

$$\sigma' > \frac{2l\pi}{k} > \sigma'',$$

where $\sigma'$ and $\sigma''$ denote the two rotation coefficients along $L_2$ and $L_1$ respectively.

To prove this fact, let us remark that the line $T^m(\alpha\beta)$, for example ($m = 1, 2, 3, \ldots$), will have a point $\alpha^{(m)}$ on $L_1$ which advances with a mean angular velocity $\sigma''$ when the transformation $T$ is indefinitely repeated; on the other hand the mean velocity of $\beta^{(m)}$ will be $\sigma'$. Let us remark also that, since every curve $T^m(\alpha\beta)$ is turned to the right of the radial direction, the points $\beta^{(m)}$ and $\alpha^{(m)}$ must be respectively the farthest to the right and the farthest to the left, in the angular sense, of all the points of $T^m(\alpha\beta)$ (see fig. 6). Consequently one will have $\sigma' > \sigma''$ (6). On the other hand the mean angular velocity of a periodic point $P$ of $\alpha\beta$ is $2l\pi/k$. Hence one must have the stated inequalities for such a periodic point $P$. Starting from the radial line which contains it, one proves analogously the stated inequalities for any other periodic point.

[Fig. 6.]

The case where $2l\pi/k = \sigma'$ or $\sigma''$ is that in which there exists a rotation coefficient commensurable with $2\pi$ along $L_2$ or $L_1$ respectively. For such a value of $\mu$ a periodic motion comes into coincidence with $L_2$ or $L_1$ respectively.

(6) See III, section 46.
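The necessary condition $\sigma' > 2l\pi/k > \sigma''$ selects, for each $k$, the integers $l$ that can occur. The following sketch enumerates the pairs without common factor (the primary ones) up to a given $k$; the function name and the sample values of the two rotation coefficients are assumptions made for the illustration.

```python
import math

def admissible_primary_pairs(sigma_outer, sigma_inner, k_max):
    """List the coprime pairs (k, l), 1 <= k <= k_max, with
    sigma_outer > 2*l*pi/k > sigma_inner."""
    pairs = []
    for k in range(1, k_max + 1):
        l_low = math.floor(k * sigma_inner / (2.0 * math.pi))
        l_high = math.ceil(k * sigma_outer / (2.0 * math.pi))
        for l in range(l_low, l_high + 1):
            advance = 2.0 * l * math.pi / k
            if sigma_inner < advance < sigma_outer and math.gcd(k, l) == 1:
                pairs.append((k, l))
    return pairs

# Hypothetical rotation coefficients sigma' = 2.0 and sigma'' = 1.0.
print(admissible_primary_pairs(2.0, 1.0, 5))
```

With these sample values only $k = 4$ and $k = 5$ admit an integer $l$ (namely $l = 1$), since $2\pi/3$ already exceeds $2.0$.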


5. - The primary symmetric periodic motions.

Let us now study the primary symmetric periodic motions. We shall suppose that the necessary condition is satisfied.

There never exist primary symmetric periodic motions of the first category (aa).

Evidently the reasoning will be the same for all four types of this category. Let us consider the type $(\alpha\beta, \alpha\beta)$ for example.

If such a primary motion exists, let $k$ and $l$ (without common factor) be the two characteristic integers for such a point $(\rho, 0)$ of $\alpha\beta$. We shall then have

$$T^k(\rho, 0) = (\rho, 2l\pi)$$

by hypothesis. But there is also another point of $\alpha\beta$ in the series of points transformed by $T, \ldots, T^{k-1}$. Let $(\bar\rho, 0)$ be this point, with

$$T^{k'}(\rho, 0) = (\bar\rho, 2l'\pi) \qquad (0 < k' < k).$$

It now follows from the symmetry that one must also have

$$T^{-k'}(\rho, 0) = (\bar\rho, -2l'\pi).$$

Combining these relations, one obtains

$$T^{2k'}(\rho, 0) = (\rho, 4l'\pi).$$

Consequently one concludes that $k = 2k'$, $l = 2l'$, which is impossible since $k$ and $l$ have no factor in common.

Let us now prove the following facts concerning the periodic motions of the other categories (ab), (aa'), (ab').

If the necessary condition is satisfied there will exist an odd number of primary symmetric periodic motions of each type of the respective categories: (ab) if $k$ is even and $l$ is odd; (aa') if $k$ is odd and $l$ is even; (ab') if $k$ and $l$ are odd.

These primary motions will vary analytically with $\mu$ and can appear or disappear only at the coincidence of two primary motions of the same kind or, more generally, at the coincidence of an even number of such motions.
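The dependence of the category on the parity of the characteristic integers can be written as a small lookup. This is only a restatement of the rule just stated, in a sketch whose function name is an assumption.

```python
import math

def symmetric_primary_category(k, l):
    """Category of a primary symmetric periodic motion with characteristic integers k, l."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    if math.gcd(k, l) != 1:
        raise ValueError("k and l have a common factor, so the motion is not primary")
    if k % 2 == 0:
        return "(ab)"   # k even forces l odd, since k and l are coprime
    if l % 2 == 0:
        return "(aa')"  # k odd, l even
    return "(ab')"      # k and l both odd

for pair in [(4, 1), (5, 2), (5, 1)]:
    print(pair, symmetric_primary_category(*pair))
```

The category (aa) never appears in the output, in agreement with the non-existence statement for primary motions of the first category.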

Supposons en premier lieu que k soit pair et que l soit impair. Nous pouvons 
supposer ici qu’un des croisements & angle droit correspond & un point (p, 0) de ap. 

k _k 

Par la definition de k et l nous aurons 7*(p, 0) — (p, 2ln) d’ou T*(g, 0 )-=T *(p, 2ln). 

* * 

Mais la symgtrie des transformations T 2 et T 2 nous montre que T*(g, 0) 
et T 2 (p, 0) doivent etre des points symgtriques par rapport ft 0 = 0. Il en rgsulte 


674 



8 


G. D. Birkhoff: Sur le problime restreint des trois corps [16] 


ainsi que ces points se trouvent sur 0 = 0 ou 0—*r, et qu’ils coincident g6om6- 

* _ * 

triquement. Nous pouvons done 6crire T*(e, 0) — (p, Xn) et T 2 (p, 0) = (p, — Xn). 
Ainsi, en comparant, nous obtenons A = /. Puisque l est impair le point (p, In) 
se trouve sur la ligne 0 = ;r. Par consequent, quand k est pair et L est impair, 
un mouvement correspondent primaire doit £tre de la catygorie ( ab ). 

Suppose now that $k$ is odd and $l$ is even. Suppose also, for example, that one of the right-angle crossings corresponds to a point $(\rho, 0)$ of $\alpha\beta$. We then have $T^{k}(\rho, 0) = (\rho, 2l\pi)$. Consider now the symmetric points $T^{(k-1)/2}(\rho, 0)$ and $T^{-(k-1)/2}(\rho, 0)$, which we call $Q$ and $\bar Q$. Evidently $\bar Q = T(Q)$ and at the same time $\bar Q = R(Q)$. Consequently $U(Q) = Q$, so that the point $Q$ lies on the line $\alpha'\beta'$ or on the line $\gamma'\delta'$. Let this point be $(\bar\rho, \theta + 2\lambda\pi)$, where $0 < \theta < \pi$ if $Q$ lies on $\alpha'\beta'$ and $-\pi < \theta < 0$ if $Q$ lies on $\gamma'\delta'$. We then have

$$T^{-(k-1)/2}(\rho, 0) = (\bar\rho, -\theta - 2\lambda\pi).$$

But if we take $R$ to be the reflection in $\theta = 0$ and then define $U = RT$, we find in the first case, $0 < \theta < \pi$,

$$T^{(k+1)/2}(\rho, 0) = T(\bar\rho, \theta + 2\lambda\pi) = RU(\bar\rho, \theta + 2\lambda\pi) = R(\bar\rho, \theta - 2\lambda\pi) = (\bar\rho, -\theta + 2\lambda\pi),$$

since $U$ leaves the points of $\alpha'\beta'$ invariant. By comparison we conclude that $\lambda = l/2$, which is possible since $l$ is even by hypothesis. In the second case we find

$$T^{(k+1)/2}(\rho, 0) = RU(\bar\rho, \theta + 2\lambda\pi) = R(\bar\rho, \theta - 2(\lambda - 1)\pi) = (\bar\rho, -\theta + 2(\lambda - 1)\pi),$$

since for the points $(\rho, \theta)$ of $\gamma'\delta'$ we have $U(\rho, \theta) = (\rho, \theta + 2\pi)$. By comparison we obtain the equality $\lambda = (l+1)/2$, which would contradict our hypothesis. Hence in this case the motion must be of category $(aa')$.
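The two displayed computations depend on how $U$ acts on the other sheets of the covering of the section, a step the text uses without comment. The following is a short sketch of that step, assuming (as the rest of the argument does) that $T$ commutes with the shift $\theta \mapsto \theta + 2\pi$ and that $R(\rho, \theta) = (\rho, -\theta)$:

```latex
% Sketch only: T commutes with the shift of sheets, R reverses it.
\begin{align*}
U(\rho,\theta+2\lambda\pi) &= R\,T(\rho,\theta+2\lambda\pi)
   = R\bigl(T(\rho,\theta) + (0, 2\lambda\pi)\bigr)
   = U(\rho,\theta) - (0, 2\lambda\pi),\\
Q \in \alpha'\beta':\quad U(\bar\rho,\theta) = (\bar\rho,\theta)
   \;&\Longrightarrow\; T^{(k+1)/2}(\rho,0) = (\bar\rho,-\theta+2\lambda\pi),\\
Q \in \gamma'\delta':\quad U(\bar\rho,\theta) = (\bar\rho,\theta+2\pi)
   \;&\Longrightarrow\; T^{(k+1)/2}(\rho,0) = (\bar\rho,-\theta+2(\lambda-1)\pi),\\
\text{in both cases}\quad T^{(k+1)/2}(\rho,0) &= T^{-(k-1)/2}(\rho,2l\pi)
   = (\bar\rho,-\theta-2\lambda\pi+2l\pi).
\end{align*}
```

Equating the last line with the second gives $\lambda = l/2$; equating it with the third gives $\lambda = (l+1)/2$.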

Evidently, if $k$ and $l$ are both odd, every primary motion must be of category $(ab')$ by the same reasoning.

It remains to prove that these primary motions are odd in number and vary analytically with $\mu$ in the manner stated. To see this in the first case we reason as follows.

Since the curve $T^{k/2}(\alpha\beta)$ is turned to the right of the radial direction, it will cut $\gamma\delta$ ($\theta = l\pi$) in an odd number of points or not at all. But this second possibility does not occur; otherwise $T^{-k/2}(\alpha\beta)$ would not cut $\theta = -l\pi$. Let us now recall the symmetry of these two curves with respect to $\theta = 0$, and the fact that the angular rotation between the points $\alpha$ and $\alpha^{(k/2)} = T^{k/2}(\alpha)$ is measured by $\tfrac{1}{2}k\sigma'$, and that between the points $\beta$ and $\beta^{(k/2)}$ by $\tfrac{1}{2}k\sigma''$. Letting the point $P(\rho, \theta)$ start from $\alpha^{(k/2)}$ and move along $T^{k/2}(\alpha\beta)$, while the symmetric point $(\rho, -\theta)$ moves along $T^{-k/2}(\alpha\beta)$, we see that $\theta = l\pi$ occurs an odd number of times. These are precisely the primary points of type $(\alpha\beta, \gamma\delta)$ in question.

Evidently these points vary analytically with $\mu$ in the manner stated. The other two cases are treated analogously.


6. - Secondary symmetric periodic motions.

We have seen that there exist no primary symmetric periodic motions of the first category. On the other hand, we shall prove that secondary symmetric periodic motions never exist except in the first category.

To begin, suppose that there is a symmetric periodic motion of the second category, for example of type $(\alpha\beta, \gamma\delta)$, which is secondary. We may then put $k = 2k_1$, $l = 2l_1$, where $k_1$ and $l_1$ have no common factor. Moreover we must have

$$T^{k'}(\rho, 0) = (\bar\rho, \lambda\pi), \qquad T^{k''}(\bar\rho, \lambda\pi) = (\rho, 4l_1\pi),$$

where $k' + k'' = k = 2k_1$ and where $\lambda$ is odd. Indeed, there is a right-angle crossing at the point $(\rho, 0)$ of $\alpha\beta$; then, after $k' - 1$ intersections with the surface of section $S_1$, there is another right-angle crossing at the point $(\bar\rho, \lambda\pi)$ of $\gamma\delta$; then, after $k'' - 1$ further intersections, the point of the trajectory returns to $(\rho, 4l_1\pi)$.

Moreover, we have $T^{-k'}(\rho, 0) = (\bar\rho, -\lambda\pi)$ because of the symmetry, hence

$$T^{2k'}(\rho, 0) = (\rho, 2\lambda\pi).$$

Since the motion is secondary it follows that $k' = k_1$, $\lambda = 2l_1$. But $\lambda$ was odd by hypothesis.

If we had begun by supposing that the motion considered was of the third category $(aa')$, for example of type $(\alpha\beta, \alpha'\beta')$, we would reason in an analogous manner. Here we must have

$$T^{k'}(\rho, 0) = (\bar\rho, \theta + 2\lambda\pi), \qquad T^{k''}(\bar\rho, \theta) = (\rho, (4l_1 - 2\lambda)\pi),$$

where $(\bar\rho, \theta)$ is a point of $\alpha'\beta'$ with $0 < \theta < \pi$ and where $k' + k'' = k = 2k_1$. But by the first equation we have $T^{-k'}(\rho, 0) = (\bar\rho, -\theta - 2\lambda\pi) = T(\bar\rho, \theta - 2\lambda\pi)$, since $T(\bar\rho, \theta) = RU(\bar\rho, \theta) = (\bar\rho, -\theta)$. Hence we obtain $T^{2k'+1}(\rho, 0) = (\rho, 4\lambda\pi)$. We would conclude that $2k' + 1 = 2k_1$, $\lambda = l_1$, which is absurd.




In the same way the possibility that the secondary motion is of category $(ab')$ must be excluded.

It remains to consider the possibility of secondary symmetric periodic motions of the first category, for example of type $(\alpha\beta, \alpha\beta)$. In this case we have

$$T^{k'}(\rho, 0) = (\bar\rho, 2\lambda\pi), \qquad T^{k''}(\bar\rho, 0) = (\rho, (4l_1 - 2\lambda)\pi),$$

with $k' + k'' = k = 2k_1$. Using the symmetry as before, we obtain $T^{-k'}(\rho, 0) = (\bar\rho, -2\lambda\pi)$, hence $T^{2k'}(\rho, 0) = (\rho, 4\lambda\pi)$, and so $k' = k'' = k_1$, $\lambda = l_1$. These equations then become

$$T^{k_1}(\rho, 0) = (\bar\rho, 2l_1\pi), \qquad T^{k_1}(\bar\rho, 0) = (\rho, 2l_1\pi),$$

where $\bar\rho \neq \rho$ since the motion is secondary.

But the second equation follows from the first and from the symmetry of $T$. Indeed we obtain $T^{-k_1}(\rho, 0) = (\bar\rho, -2l_1\pi)$, which is equivalent to the second equation. Hence the single equation $T^{k_1}(\rho, 0) = (\bar\rho, 2l_1\pi)$ with $\bar\rho \neq \rho$ gives these secondary motions.

Thus we see that the intersections of $T^{k_1}(\alpha\beta)$ with $\alpha\beta$ ($\theta = 2l_1\pi$) give the secondary points belonging to $2k_1$, $2l_1$, or else the primary points belonging to $k_1$, $l_1$, according as $\bar\rho \neq \rho$ or $\bar\rho = \rho$. All the points for which $\bar\rho = \rho$ are necessarily primary, with characteristic integers $k_1$ and $l_1$. These points always exist in odd number if the necessary condition is satisfied, as we have seen. The other points are grouped in pairs $(\rho, 0)$, $(\bar\rho, 0)$, each pair corresponding to one secondary motion.

Suppose that $\mu$ increases from $\mu = 0$. For $\mu$ very small there will exist no secondary symmetric periodic point of type $(\alpha\beta, \alpha\beta)$ with the characteristic integers $(2k_1, 2l_1)$. Indeed $T^{2k_1}(\alpha\beta)$ and $\alpha\beta$ cross, for $\mu = 0$ and even for $\mu$ sufficiently small, at most once, in a primary point.

How can the other crossing points of $T^{k_1}(\alpha\beta)$ with $\alpha\beta$ arise, which correspond to secondary points of type $(\alpha\beta, \alpha\beta)$ belonging to the characteristic integers $(2k_1, 2l_1)$? We shall see that this can happen in two rather different ways.

At the moment $\mu = \mu_0$ when new points of this kind arise or such points disappear, the curve $T^{k_1}(\alpha\beta)$ becomes tangent to $\alpha\beta$ at the corresponding points of coincidence. Let us consider the neighborhood of such a point.

Suppose first that this point $P$ is not a primary point (see fig. 7), so that it is secondary.





The image $T^{-k_1}(\alpha\beta)$ of $\alpha\beta$ by $T^{-k_1}$ will occupy a position symmetric with respect to $\alpha\beta$ (see the dotted line of fig. 7).

There are two cases here, according as $T^{k_1}(\alpha\beta)$ does or does not cut $\alpha\beta$ at the point $P$. If $T^{k_1}(\alpha\beta)$ does not cut it, there is a double contact in the "general" case (7). Evidently at this moment only the appearance or the disappearance of two simple points of this kind can occur. But then there is another point $\bar P$ of $\alpha\beta$ which belongs to the same group of points, where $\bar P = T^{k_1}(P)$ (see fig. 7). The transformation $T^{-k_1}$ changes $T^{k_1}(\alpha\beta)$ into $\alpha\beta$, and $\alpha\beta$ into $T^{-k_1}(\alpha\beta)$, which is the symmetric image of $T^{k_1}(\alpha\beta)$. Hence there is also at $\bar P$ a double point of the same kind as that at $P$.

Let us examine the indices in this case. If $u$ and $v$ are symmetric coordinates in $S_1$ which vanish at the fixed point $P$ of $T^{2k_1}$, the corresponding transformation can be written in the form

$$\bar u = au + bv + fu^2 + \dots, \qquad \bar v = cu + dv + gu^2 + \dots,$$

where $ad - bc = 1$ (see equation (33), Part I). But the $u$-axis ($\alpha\beta$) can be transformed into a curve tangent to it with double contact only in the case where $c = 0$ and where the coefficient $g$ of $u^2$ in the series for $\bar v$ is not zero. It follows that the transformation $T^{2k_1}$ must have the more special form

$$\bar u = au + bv + fu^2 + \dots, \qquad \bar v = \frac{1}{a}\,v + gu^2 + \dots.$$

On the other hand, because of the symmetry of $T^{2k_1}$ and of $T^{-2k_1}$, the transformation of $(u, v)$ into $(\bar u, \bar v)$ must be identical with the transformation of $(\bar u, -\bar v)$ into $(u, -v)$, that is,

$$u = a\bar u - b\bar v + f\bar u^2 + \dots, \qquad -v = -\frac{1}{a}\,\bar v + g\bar u^2 + \dots.$$

Consequently we must have $a = \pm 1$. In the case $a = -1$ we must have $g = 0$; this case is therefore inadmissible, and $a = 1$. It also follows that $2f = bg$. Hence the transformation $T^{2k_1}$ in such coordinates will be of the form

$$\bar u = u + b\left(v + \tfrac{1}{2}gu^2\right) + \dots, \qquad \bar v = v + gu^2 + \dots.$$
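The constraints $a = \pm 1$, $g = 0$ when $a = -1$, and $2f = bg$ can be checked by substituting the first pair of series into the symmetric pair along the $u$-axis. A brief sketch, keeping only terms through $u^2$ and setting $v = 0$ (a restriction adopted here only to shorten the computation):

```latex
% On v = 0 the products \bar u \bar v and \bar v^2 are O(u^3), so they drop out.
\begin{align*}
v = 0:\quad \bar u &= au + fu^2 + O(u^3), \qquad \bar v = gu^2 + O(u^3),\\
u &= a\bar u - b\bar v + f\bar u^2 + \dots
   = a^2 u + \bigl(af - bg + a^2 f\bigr)u^2 + O(u^3),\\
0 = -v &= -\tfrac{1}{a}\,\bar v + g\bar u^2 + \dots
   = g\bigl(a^2 - \tfrac{1}{a}\bigr)u^2 + O(u^3).
\end{align*}
```

Comparing coefficients gives $a^2 = 1$; for $a = -1$ the last line reduces to $2g = 0$, and for $a = 1$ the second line reduces to $2f - bg = 0$.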

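But we shall then have

$$\bar u - u = b\left(v + \tfrac{1}{2}gu^2\right) + \dots, \qquad \bar v - v = gu^2 + \dots.$$

The corresponding vector $(\bar u, \bar v) - (u, v)$ will have nearly the same direction as that defined by

$$\frac{\bar v - v}{\bar u - u} = \frac{gu^2}{bv} \quad\text{or by}\quad \frac{dv}{du} = \frac{gu^2}{bv}.$$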


(7) This is the only case that we consider here, in order to avoid needless complications. All the analytical tools needed in the truly general case are developed in my Memoir III.




The integral curves of this differential equation are

$$\tfrac{1}{3}gu^3 - \tfrac{1}{2}bv^2 = C.$$

Hence the corresponding index must be 0.
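A short check of the integral curves and of the index, treating the leading-order direction field $(bv,\ gu^2)$ as exact and assuming $b \neq 0$ (both are simplifications made for this sketch; the text claims only that the true displacement has nearly this direction):

```latex
\begin{align*}
\frac{dv}{du} = \frac{gu^2}{bv}
 \;&\Longrightarrow\; bv\,dv = gu^2\,du
 \;\Longrightarrow\; \tfrac{1}{3}gu^3 - \tfrac{1}{2}bv^2 = C,\\
\operatorname{sign}\bigl(gu^2\bigr) &= \operatorname{sign}(g)
 \quad\text{for all } u \neq 0 .
\end{align*}
```

Since the second component of the field never changes sign, the direction of the field along a small circuit about the origin stays within a closed half-plane and cannot make a full turn, which is consistent with the index 0.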

Consequently in this case one of the new secondary points will have index 1. The reciprocal roots of the characteristic equation are then imaginary, near 1, and of modulus 1, since the linear terms in the series vary continuously with $\mu$. The other of these points will have index $-1$, with real reciprocal characteristic roots near 1. The first point is elliptic and in general of stable type. The other point is hyperbolic and unstable, with four asymptotic branches ending at this point.

Fig. 8.

It remains to consider the case where the curve $T^{k_1}(\alpha\beta)$ becomes tangent to $\alpha\beta$ at a point $P$ of $\alpha\beta$ which is primary and of type $(\alpha\beta, \gamma\delta)$, or of one of the types $(\alpha\beta, \alpha'\beta')$, $(\alpha\beta, \gamma'\delta')$, according as $k_1$ is even or odd. We shall treat the case of $k_1$ even. The case of $k_1$ odd is entirely analogous.

Let us remark first that in this case the order of contact at the point $P$ must be at least three. Indeed, suppose that the order of contact is two. By the reasoning above, the form of the transformation $T^{k_1}$ will be the same as that of $T^{2k_1}$, and the two directions of the tangents drawn at $P$ will coincide with that of the positive axis (see fig. 8). Hence if two points $P_1$, $P_2$ appear (see fig. 8), they cannot be interchanged by the transformation $T^{k_1}$. Here we would simply have two new primary points; we have no need to pursue the study of this case.

We have therefore now to consider the case of contact of the third order, where the two curves cross. But, by the same analysis of the series for $T^{k_1}$, the case $a = 1$ cannot arise here; otherwise the new points would again be primary. Thus we necessarily obtain for $T^{k_1}$, at $P$, the form

$$\bar u = -u + fu^2 + \dots, \qquad \bar v = -v + \dots + huv + kv^2 + lu^3 + \dots \qquad (f \neq 0),$$

and fig. 9 is the corresponding figure. We see here that $P_1$ and $P_2$ are interchanged by $T^{k_1}$, while the point $P$ between $P_1$ and $P_2$ must be transformed into itself.

Hence we shall have the simple primary point $P$ of type $(\alpha\beta, \gamma\delta)$, which varies analytically, and two secondary points of type $(\alpha\beta, \alpha\beta)$ in the neighborhood. The index of the two points of this new group is $+1$, and the points may be stable or directly unstable.

Fig. 9.

I say that at the same time there is a contact of the third order of $T^{k_1}(\gamma\delta)$ with $\gamma\delta$ at the point $\bar P$ of $\gamma\delta$ associated with $P$. In fact, the form of the transformation $T^{k_1}$ is the same at the associated point $\bar P$ as at the point $P$, at least with respect to suitable axes at $\bar P$. In particular the characteristic roots are both equal to $-1$. Taking analogous symmetric coordinates at $\bar P$, and recalling that $T^{k_1}$ reverses all directions at the point $\bar P$ as well as at the point $P$, it becomes evident that $T^{k_1}$ must have the form

$$\bar u = -u + \dots, \qquad \bar v = -v + \dots.$$

Consequently the $k_1$-th image of $\gamma\delta$ will have a contact of at least the second order with $\gamma\delta$. But if the order of contact were only two, the curves $T^{k_1}(\gamma\delta)$ and $T^{-k_1}(\gamma\delta)$ (which is the symmetric image of $T^{k_1}(\gamma\delta)$) would also have a direct contact of this order, and there would be two primary points of type $(\alpha\beta, \gamma\delta)$ in the neighborhood, which is impossible in the case we are considering.

Hence under our hypothesis there is a contact of the third order at the point $\bar P$ as well as at the point $P$. Consequently at the same moment two secondary symmetric periodic motions belonging to $2k_1$, $2l_1$ will appear in the neighborhood of the primary motion of type $(\alpha\beta, \gamma\delta)$. These secondary motions are of the two types $(\alpha\beta, \alpha\beta)$ and $(\gamma\delta, \gamma\delta)$ of the first category.

If $k_1$ is odd, $\alpha'\beta'$ or $\gamma'\delta'$ must replace $\gamma\delta$ in this reasoning, according as $l_1$ is even or odd.

To extend this reasoning to the most general case we would need the theory of the most complicated fixed points. The essential fact on which the reasoning depends is that there is a complete symmetry with respect to $\alpha\beta$, $\gamma\delta$ and to $\alpha'\beta'$, $\gamma'\delta'$ in suitable coordinates.

Secondary symmetric periodic motions of the first category can appear or disappear in pairs of the same type, or even of two different types. This latter possibility will occur in general when they appear or disappear in the immediate vicinity of a primary motion having the same characteristic ratio, around which they circulate twice during one period.

7. - Tertiary symmetric periodic motions, etc.

The same reasoning leads to analogous results concerning the higher species:

Let $k = sk_1$ and $l = sl_1$ ($s > 1$) be given characteristic integers, where $k_1$ and $l_1$ have no common factor. When $s$ is odd, every symmetric periodic motion of the $s$-th species with these characteristic integers will be of the respective categories $(ab)$, $(aa')$ or $(ab')$ according as $k_1$ is even and $l_1$ odd, $k_1$ odd and $l_1$ even, or $k_1$ and $l_1$ both odd. When $s$ is even, every corresponding symmetric periodic motion must be of the first category $(aa)$.

Such motions of the $s$-th species can appear or disappear in pairs of the same type for $s$ odd. For $s$ even they can appear or disappear in this manner, or else the two coinciding motions can be of two different types of the first category.
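The parity rules just stated, together with those for primary motions, amount to a small lookup. The following is a minimal sketch of that lookup; the function name `symmetric_category` and the restriction to positive integers are assumptions made for illustration and do not come from the text.

```python
from math import gcd


def symmetric_category(k: int, l: int) -> str:
    """Category of a symmetric periodic motion with characteristic
    integers (k, l), following the parity rules stated in the text."""
    if k <= 0 or l <= 0:
        # Assumption of this sketch: both integers are positive.
        raise ValueError("characteristic integers are assumed positive")
    s = gcd(k, l)              # species: k = s*k1, l = s*l1, gcd(k1, l1) = 1
    k1, l1 = k // s, l // s
    if s % 2 == 0:
        return "(aa)"          # even species: first category only
    if k1 % 2 == 0:            # l1 is then odd, because gcd(k1, l1) = 1
        return "(ab)"
    if l1 % 2 == 0:
        return "(aa')"
    return "(ab')"
```

For example, `symmetric_category(9, 6)` has $s = 3$, $k_1 = 3$, $l_1 = 2$ and falls in category $(aa')$, while any pair with an even common factor, such as `(6, 4)`, falls in $(aa)$.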


8. - Remark on the interrelations of the symmetric motions.
If one wished to go still further with the analysis of the symmetric motions, one could regard the interlacing of the images of $\alpha\beta$, $\gamma\delta$, $\alpha'\beta'$, $\gamma'\delta'$ with $\alpha\beta$, $\gamma\delta$, $\alpha'\beta'$, $\gamma'\delta'$ as a kind of complete symbol for the relative arrangement of the periodic motions. For example the partial symbol on the left, where $k_1$, $l_1$ have no common factor and are both odd, indicates the existence of a single primary symmetric periodic motion of type $(\alpha\beta, \gamma\delta)$ with characteristic integers $2k_1$, $l_1$. This would correspond to the point $P$. There would also exist a single secondary motion of type $(\alpha\beta, \alpha\beta)$ with characteristic integers $4k_1$, $2l_1$, with which another secondary motion of type $(\gamma\delta, \gamma\delta)$ must be associated. This motion of type $(\alpha\beta, \alpha\beta)$ corresponds to the points $P_1$, $P_2$ of the partial symbol.

We shall not pursue further the study of this kind of symbolism, which enjoys numerous interesting properties. For example, because the images of $\alpha\beta$ and $\gamma\delta$ by $T$, $T^2$, ..., are always "turned to the right", the partial symbol indicated is essentially the only one possible in the case of three crossings of $\alpha\beta$ by $T^{2k_1}(\alpha\beta)$. Thus it follows that the secondary symmetric motion of type $(\alpha\beta, \alpha\beta)$ gives rise to two right-angle crossings with the $x$-axis, one to the right ($\rho$ larger) and the other to the left ($\rho$ smaller) of the crossing of the primary symmetric periodic motion (see fig. 11).

9. - Coincidence of the symmetric motions with $L_1$ or $L_2$.

Up to now we have not considered the case $l_1/k_1 = \sigma'$ or $\sigma''$, where all the symmetric periodic motions with $k/l = k_1/l_1$ come to coincide with $L_1$ or $L_2$ and then disappear. It must not be forgotten that the transformation $T$ is defined even a little outside $L_1$ and $L_2$.

It is to be remarked here that for $l_1/k_1 = \sigma'$, for example, $\alpha$, $\gamma$, $\alpha'$ and $\gamma'$ will be transformed into $\alpha$, $\gamma$, $\alpha'$ and $\gamma'$ respectively by $T^{k_1}$. Indeed, suppose for example that $k_1$ is even and $l_1$ odd. Up to the moment of coincidence there will exist two primary symmetric motions, one of type $(\alpha\beta, \gamma\delta)$, the other of type $(\alpha'\beta', \gamma'\delta')$, belonging to the characteristic integers $k_1$, $l_1$. Hence $\alpha$, $\gamma$, $\alpha'$, $\gamma'$ will be fixed points of $T^{k_1}$ in this case. The other cases can be treated analogously, using our preceding results on the existence of primary periodic motions.

Now let us remark that all the periodic trajectories lying in the neighborhood of $L_1$ cross $S_1$ and the extension of $S_1$ a little outside $L_1$ alternately. Consequently, when $\sigma'$ passes through $2\pi l_1/k_1$, all the corresponding primary periodic motions (symmetric or non-symmetric) disappear at once in a multiple coincidence with $L_1$; the other corresponding periodic motions of higher species will already have disappeared, since the successive images of any ray are situated, in the angular sense, between their two extreme points.

Fig. 11.

We see therefore that in the case $\sigma'$ (or $\sigma''$) $= 2\pi l_1/k_1$ all the primary periodic motions, symmetric or non-symmetric, with the characteristic ratio $l_1/k_1$ come to coincide with $L_1$ (or with $L_2$) and then disappear, while the periodic motions of higher species will already have disappeared.

At the moment of coincidence the points of $L_1$ (or $L_2$) which lie on an axis of symmetry $\alpha\beta$, $\gamma\delta$, $\alpha'\beta'$, $\gamma'\delta'$, and all their images by $T$, ..., $T^{k_1 - 1}$, are fixed points of the transformation $T^{k_1}$.

10. - On the existence of periodic motions of arbitrarily high species.

One may ask whether symmetric periodic motions of the $s$-th species exist for arbitrary $s$.

If the necessary condition, $\sigma' > 2\pi l_1/k_1 > \sigma''$, is satisfied, there will always exist for $\mu \neq 0$ symmetric periodic motions with characteristic integers $sk_1$, $sl_1$ where $s$ is arbitrarily large.

Indeed, suppose that the contrary holds. Then in particular there will exist a finite number of crossings of the sequence $T(\alpha\beta)$, $T^2(\alpha\beta)$, ..., with $\alpha\beta$ whose corresponding characteristic integers are of the form $sk_1$, $sl_1$. Hence we can find equimultiples $k$ and $l$ of the integers $k_1$ and $l_1$ such that $T^k$ leaves all these points invariant. All the successive images of $\alpha\beta$ by $T^k$ and its powers would cut $\alpha\beta$ ($\theta = 2l\pi$) in these points and in these points only. Consider now the crossing point $P$ of $\alpha\beta$ which is nearest to $\alpha$. It would have to be transformed into the point of $T^k(\alpha\beta)$ nearest to $\alpha^{(k)}$. But the first point of meeting on $T^k(\alpha\beta)$ must be transformed into the last point on $\alpha\beta$, since $T^k(\alpha\beta)$ is turned to the right of the radial direction. Hence there must be a single crossing point, $P$. Evidently the point $P$ must be primary, and all the images of $\alpha\beta$ by $T^k$, $T^{2k}$, ..., will cut one another only at $P$ (see fig. 12). We see therefore that the region $\Sigma_1$ formed by $\alpha P\alpha^{(k)}$ and by all its images, whether by $T^k$ or by $T^{-k}$, is an open region, symmetric with respect to $\alpha\beta$, which is entirely distinct from the analogous region $\Sigma_2$ formed by $\beta P\beta^{(k)}$ and by its images. We regard the annular surface $S_1$ here as a Riemann surface with an infinite number of sheets.

If there did not even exist congruent points in the interior of $\Sigma_1$ and of $\Sigma_2$ respectively, we would obtain an impossible conclusion. Indeed $\Sigma_1$ and $\Sigma_2$ would then form two distinct regions of $S_1$ which would be invariant under $T^k$. This would contradict the hypothesis of transitivity.

But to prove that such congruent points cannot exist, it suffices to prove that in the neighborhood of any interior point whatever of $\Sigma_1$ there must exist points whose coordinate $\theta$ increases to an arbitrarily large positive value under repeated iteration of $T^k$; and that at the same time the analogous property holds for all the points of $\Sigma_2$, that is, that there exist neighboring points whose coordinate $\theta$ decreases to an arbitrarily large negative value.

Indeed, let $P$ be such a point $(\rho, \theta)$ of $\Sigma_1$ whose congruent point $(\rho, \theta + 2b\pi)$ belongs to $\Sigma_2$. If the property stated holds, we could find in the immediate neighborhood of $(\rho, \theta)$ a small region of $\Sigma_1$ which after $j$ iterations of $T^k$ lies above the $\theta$-axis, while at the same time the small congruent region of $\Sigma_2$ lies below this axis after $l$ iterations of $T^k$. But a region of $\Sigma_1$ above the $\theta$-axis and a region of $\Sigma_2$ below this axis remain there under indefinite repetition of $T^k$, as the figure shows. Hence this conclusion is absurd.

It only remains, therefore, to prove the property stated in order to see that the result announced must be true. If this property does not hold, we could find a small neighborhood $\sigma$ in $\Sigma_1$ all of whose images by $T^k$ remain bounded above in $\theta$. But such a neighborhood is eventually carried above the line $\theta = 0$. Hence all the images of $\sigma$ except a finite number lie in a region $0 \leq \theta \leq 2m\pi$. Moreover they have no point in common. But the invariant integrals over all these regions are equal, and at the same time the total integral is finite, which is impossible. Consequently the images cannot be bounded above in $\theta$.
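The counting behind this contradiction can be written out as follows. Here $I(\cdot)$ denotes the invariant integral, $M$ its finite value over the region $0 \leq \theta \leq 2m\pi$, and $j_0$ an index beyond which every image lies in that region; these three symbols are introduced only for this sketch.

```latex
% Pairwise disjoint images of equal measure inside a region of finite measure.
\begin{align*}
I\bigl(T^{jk}\sigma\bigr) &= I(\sigma) > 0 \qquad (j = 0, 1, 2, \dots),\\
\sum_{j=j_0}^{j_0+N-1} I\bigl(T^{jk}\sigma\bigr) &= N\,I(\sigma) \;\le\; M \;<\; \infty
 \qquad\text{for every } N .
\end{align*}
```

The inequality uses the fact that the images have no point in common, and it fails as soon as $N > M / I(\sigma)$.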

11. - On the existence of other periodic motions.

To every non-symmetric periodic motion there always corresponds a periodic motion in the symmetric position with respect to the $\theta$-axis. These two motions are traversed in opposite orders of the time $t$. This means that if $P$ is a periodic point of $S_1$ which lies neither on $\alpha\beta$, $\gamma\delta$, $\alpha'\beta'$, $\gamma'\delta'$ nor on one of their images by $T$, the point $R(P)$ will have the same property also. Indeed, if $T^k(P) = P$ we have $RT^k(P) = R(P)$, hence $T^{-k}(RP) = RP$. These two associated motions are the same in all respects, allowance being made for the inversion of the order of time. For example their indices and their characteristic integers are the same.
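The step from $RT^k(P) = R(P)$ to $T^{-k}(RP) = RP$ rests on the relation $RT^k = T^{-k}R$. The following is a minimal numerical sketch of that relation on a stand-in map: the symmetrically split standard map below is not the transformation $T$ of the restricted problem, the kick strength `K` is an arbitrary illustrative value, and points are written as `(theta, r)` rather than $(\rho, \theta)$. It is chosen only because the reflection $\theta \mapsto -\theta$ reverses it.

```python
import math

K = 0.5  # illustrative kick strength; not a value taken from the text


def T(p):
    """Stand-in reversible map: half kick, rotation, half kick."""
    theta, r = p
    r_half = r + 0.5 * K * math.sin(theta)
    theta_new = theta + r_half
    return (theta_new, r_half + 0.5 * K * math.sin(theta_new))


def T_inv(p):
    """Exact inverse of T, undoing the three steps in reverse order."""
    theta, r = p
    r_half = r - 0.5 * K * math.sin(theta)
    theta_old = theta - r_half
    return (theta_old, r_half - 0.5 * K * math.sin(theta_old))


def R(p):
    """Reflection in theta = 0."""
    theta, r = p
    return (-theta, r)


def iterate(f, p, n):
    """Apply f to p a total of n times."""
    for _ in range(n):
        p = f(p)
    return p


def reversed_orbit_gap(p, k):
    """Distance between T^{-k}(R(p)) and R(T^k(p)); zero up to rounding."""
    a = iterate(T_inv, R(p), k)
    b = R(iterate(T, p, k))
    return math.hypot(a[0] - b[0], a[1] - b[1])
```

Calling `reversed_orbit_gap((0.3, 1.1), 7)` returns a value at the level of rounding error, so a point returned to itself by `T` iterated `k` times has a reflected partner returned to itself by `T_inv` iterated `k` times, which is the pairing of associated motions described above.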

If, as $\mu$ varies, two (non-symmetric) periodic motions coincide in a multiple non-symmetric motion, there are at the same time two associated (non-symmetric) motions which coincide with the associated multiple motion. It could also happen that two non-symmetric motions (8) coincide with one another in a symmetric motion.


(8) We consider as before the simplest typical case.




In the first part we have already seen that there probably exist, for $C$ sufficiently large and $\mu$ small, non-symmetric periodic motions which for $\mu = 0$ reduce to non-symmetric periodic motions. We now wish to show why non-symmetric periodic motions of an entirely different nature must exist, and in a great variety of ways, whatever the value of $\mu \neq 0$.

We shall reason on the basis of the following property of the asymptotic branches of periodic points, which I have proved in my Pontifical Memoir: in the transitive case every non-double asymptotic branch $\alpha$ or $\omega$ (9) of a hyperbolic point must be everywhere dense in $S_1$ and will cut all the other non-double branches $\omega$ or $\alpha$ respectively an infinite number of times (10). Evidently two $\alpha$ branches or two $\omega$ branches can have no point in common.

According to what precedes there are symmetric periodic points of $T^k$, for $k$ and $l$ suitably chosen, which possess zero or negative indices, since the total index for each $k$ and $l$ is always zero. These points will be of hyperbolic type, with asymptotic branches which are analytic or hypercontinuous (11). Let $P$ be such a symmetric periodic point. Because of the symmetry its branches will be symmetrically disposed with respect to the $\theta$-axis, if $P$ lies on $\alpha\beta$ for example.

Suppose first that at least one of the two pairs of symmetric branches adjacent to the axis of symmetry is not double; they cannot reduce to a single branch in the transitive case. These two branches must cut one another in the manner stated above. Following the two branches adjacent to the $\theta$-axis symmetrically, we obtain a first point of intersection $Q$ and a corresponding loop $P\alpha Q\omega P$ (see fig. 13). The point $Q$ will be a "homoclinic" point which tends toward $P$ as the time increases or decreases. These branches must cut one another at $Q$ with a contact of order at least one and odd.


(9) That is, which tends toward the point as the time decreases or increases respectively.

(10) To prove this I used not only the hypothesis of transitivity but also the hypothesis that there exists at least one non-degenerate stable periodic point. For $\sigma'/2\pi$ and $\sigma''/2\pi$ irrational, $L_1$ and $L_2$ are such stable motions in the restricted problem of three bodies.

(11) See III, sections 27-42.

We thus see that there exist symmetric motions homoclinic to every symmetric periodic motion of hyperbolic type which admits no double branch.

Let us remark in passing that in the case of a symmetric periodic motion of non-degenerate stable type there must also exist neighboring symmetric motions which are homoclinic to this motion. Indeed, in this case there will exist $\alpha$ and $\omega$ sets connected to the fixed point, whose points approach the fixed point while turning about it in opposite senses an infinite number of times (see fig. 14). These two sets will be symmetrically disposed and must cut one another, in the neighborhood of the point $P$ on the axis of symmetry, in an infinite number of points.

Hence there also exists an infinite number of symmetric homoclinic motions in the immediate neighborhood of every periodic motion of non-degenerate symmetric elliptic type.

The two excluded cases, namely the hyperbolic motion with double branches and the degenerate elliptic motion, cannot arise in problems such as the restricted problem of three bodies unless an infinite number of analytic conditions are satisfied.

According to my Memoir one can construct (see fig. 13) a band, filled by regular invariant curves (not analytic in general) in the neighborhood, of which $P\alpha Q\omega P$ forms one part of the boundary and one of the invariant curves forms the other part. Such an invariant curve cuts itself along a curvilinear diagonal $QS$ of the curvilinear quadrilateral $QRST$.

A first property is that along this diagonal $QS$ there is an infinite number of periodic points, of which the $k$-th ($k = \kappa, \kappa + 1, \dots$) circulates around the band after $k$ iterations of $T$ before returning to the point of $QS$. By the method employed in my Memoir one can construct a band symmetric with respect to $\theta = 0$ because of the symmetry of $T$, and hence such that the diagonal line becomes a segment of the axis $\theta = \pi$, for example. Hence:

Fig. 14.

In the immediate neighborhood of such a pair of $\alpha$ and $\omega$ branches which extend to a symmetric homoclinic motion, there exists an infinite number of other symmetric periodic motions which circulate $k$ times ($k \geq \kappa$ arbitrary) in the neighborhood, cutting $S_1$ $k$ times before returning.

A more general property is that, given an arbitrary sequence of integers $k, l, m, \dots$ ($\geq \kappa$), there exists a periodic point of $QRST$ which circulates successively around the band in the same neighborhood, returning into $QRST$ after $k$ iterations, after $l$ iterations, and so on. For every symmetric motion of this kind the corresponding sequence would have to be symmetric, for example of the form

klm....q · q....mlk.

Hence, by choosing a sequence of integers which is not of this symmetric type, we necessarily obtain periodic motions which are not symmetric with respect to $\theta = 0$ ($\alpha\beta$) or to $\theta = \pi$ ($\gamma\delta$).

I say moreover that if the band is sufficiently narrow this motion cannot be symmetric with respect to $\alpha'\beta'$ or $\gamma'\delta'$ either. Indeed $\alpha'\beta'$ and $\gamma'\delta'$ will cross the edges $P\alpha Q$, $Q\omega P$ of the band a finite number of times, in points which are different from $Q$ and from the images of $Q$. Otherwise an image of $Q$ would lie on $\alpha'\beta'$ or $\gamma'\delta'$, and this image would possess two symmetric points among its images and would therefore itself be symmetric periodic; but $Q$ is a homoclinic point. Thus the images of the points near $\alpha'\beta'$ or $\gamma'\delta'$ are not all in the interior of such a ribbon.

It is evident therefore that to every sequence $k, l, m, \dots, r$ ($\geq \kappa$) there will correspond at least one periodic motion in the same extended neighborhood of the hyperbolic symmetric periodic motion considered, and that, on taking $\kappa$ sufficiently large, such a neighboring motion will not be symmetric if the chosen sequence of integers does not enjoy a symmetry of one of the following types:

klm....qq....mlk or klm....2q....mlk.

The periodic motion belonging to the sequence klm....r corresponds to a trajectory which circulates in this extended neighborhood, returning to $QRST$ after $k$ intersections with the surface of section $S_1$, then after $l$ intersections, and so on.

The application of a modification of Poincaré's last geometric theorem also allows us to establish analogous facts in the neighborhood of a non-degenerate elliptic periodic motion (12):

There exists an infinite number of symmetric periodic motions in the ordinary neighborhood of every symmetric periodic motion of non-degenerate elliptic type.


(12) See my article: A New Criterion of Stability, Atti del Congresso Internazionale dei Matematici, t. VI, Bologna (1928).


Let us now return to the question of the existence of non-symmetric periodic motions. There remains a single possibility that no such periodic motions exist, namely that in which the symmetric hyperbolic point considered has only double branches adjacent to the axis of symmetry. But in this case these double branches will terminate in other points which are likewise fixed points of $T^k$. If these points are not of symmetric type, we would have found two non-symmetric motions. We may therefore assume that these new fixed points of $T^k$ are symmetric, and then deduce analogously that each of these points must admit at least one other double branch. Continuing in this way, the process must terminate by returning to the given periodic point, since the number of fixed points of $T^k$ is finite. But this is impossible, since in such a case the set of double branches would divide $S_1$ into invariant parts; I suppose as always that there is ordinary transitivity.

Consequently the number of non-symmetric periodic motions is certainly infinite, the integer $k$ being arbitrarily large.

12. - Symmetric and nonsymmetric recurrent motions.

Now take an arbitrary point of $S_2$. In all probability (in the sense of Lebesgue measure) the closed sets of its $\alpha$ and $\omega$ limit points fill in general the whole surface $S_2$. For the other, "special" points, such as the periodic points, at least one of the two limit sets constitutes only a part of $S_2$. If the chosen point lies on an axis of symmetry such as $\alpha\beta$, the $\alpha$ limit points and the $\omega$ limit points enjoy symmetry with respect to that axis.

Every $\alpha$ or $\omega$ limit set (symmetric or not) must contain recurrent limit sets, which have the following property: every point of such a set traverses the whole recurrent set to within a distance $\varepsilon$ during a number $N_\varepsilon$ of iterations of $T$ or of $T^{-1}$, where the integer $N_\varepsilon$ does not depend on the point chosen (13).

If a symmetric motion is recurrent, it is evident that its two limit sets, being identical, are also symmetric. But it does not follow that every symmetric recurrent set must contain a point of an axis of symmetry.

We shall see how complicated and varied the hierarchy of symmetric and nonsymmetric recurrent motions is. The importance of recurrent motions lies in the fact that there are recurrent sets among the $\alpha$ and $\omega$ limit sets of any nonrecurrent motion, and that this motion, as the time increases or decreases, may be regarded as made up of a series of closer and closer approaches to all these recurrent $\omega$ or $\alpha$ limit sets, together with corresponding recessions.


(13) See V, Chapter VII.




Consider then the same extended neighborhood of a point $P$ corresponding to a hyperbolic periodic motion with nondouble branches $P\alpha Q$, $P\omega Q$ (see the preceding section). To every doubly infinite sequence of integers $\ldots, i, j, k, l, m, \ldots$ there corresponds a closed set of points of $QRST$ (see fig. 13 above), containing at least one point, which circulate successively around the band after $k, l, m, \ldots$ iterations of $T^{-1}$ and after $j, i, \ldots$ iterations of $T$.

Let us call such a sequence of integers "recurrent" if every finite partial sequence of $s$ integers occurring in it occurs at least once in every finite partial sequence of length $N_s$.

Here is a fairly general way of constructing such recurrent sequences of integers (14). Choose a continuous periodic function $f(u_1, \ldots, u_p)$ of the variables $u_1, \ldots, u_p$, which is real with periods $2\pi$ in these variables and satisfies a bound [inequality illegible in the source]. Consider the infinite sequence of integers $\varphi_n$ defined by

$$\varphi_n = [f(\lambda_1 n, \ldots, \lambda_p n)],$$

where the symbol $[r]$ denotes the greatest integer not exceeding $r$, and where $\lambda_1, \ldots, \lambda_p$ are constants incommensurable with $2\pi$ and without any linear relation of commensurability.

When $x$ in the function $f(\lambda_1 x, \ldots, \lambda_p x)$ is replaced by $x$ plus an integer, one obtains an analogous function. Evidently the functions so obtained are infinite in number, and have as limit functions all the functions

$$f(\lambda_1(x + c_1), \lambda_2(x + c_2), \ldots, \lambda_p(x + c_p)),$$

where $c_1, \ldots, c_p$ are arbitrary. We shall suppose in addition that $f(u_1, \ldots, u_p)$ admits no integer as a relative maximum or minimum. In this case one sees that every sequence of integers found in the sequence

$$[f(\lambda_1(n + c_1), \ldots, \lambda_p(n + c_p))]$$

must reappear uniformly, whereas if this condition is not satisfied this need not be true. For example, the sequence defined by the function

$$f(u_1) = 2 - \sin^2 u_1$$

with $\lambda_1 = 1$ is $\ldots, 1, 1, 2, 1, 1, \ldots$ and contains 2 only once.
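The counterexample just given can be checked numerically. The following is only a sketch on a finite window of indices (the window size and the function names are illustrative assumptions, not part of the memoir); it forms the integer parts of $2 - \sin^2 n$ and lists the positions at which the value 2 occurs.

```python
import math

def floor_sequence(f, lambdas, n_values):
    """Return the integer parts [f(lambda_1 * n, ..., lambda_p * n)] for each n."""
    return [math.floor(f(*(lam * n for lam in lambdas))) for n in n_values]

def f_counterexample(u1):
    # f(u_1) = 2 - sin^2(u_1): the integer 2 is a relative maximum of f,
    # so the hypothesis "no integer is a relative extremum" fails.
    return 2.0 - math.sin(u1) ** 2

# Hypothetical finite window standing in for the doubly infinite sequence.
window = list(range(-200, 201))
sequence = floor_sequence(f_counterexample, [1.0], window)
positions_of_two = [n for n, value in zip(window, sequence) if value == 2]
print(positions_of_two)
```

On this window the value 2 appears at a single index, so the finite block consisting of 2 alone cannot reappear uniformly.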

More generally, if one chooses a uniformly continuous and almost periodic function $f(n)$ of a single variable which contains no rational period, and none of whose limit functions admits an integer as a relative maximum or minimum in an analogous sense, the sequence $[f(n)]$ will be recurrent.


(14) See my book V, Chapter VII. Morse has constructed an interesting special recurrent set of integers by a method entirely different from ours. See his memoir: Recurrent Geodesics on a Surface of Negative Curvature, Transactions of the American Mathematical Society, t. 22 (1921).

Such recurrent sequences can be purely periodic only in the trivial case in which

$$e < f(n) < e + 1$$

($e$ being an integer).

Indeed, suppose that the upper bound of $f$ is $\bar f$, and set $e = [\bar f]$, where $e \neq \bar f$ because of the condition imposed. If the sequence $[f(n)]$ is periodic of period $p$ we have $[f(p + n)] = [f(n)]$ for every $n$. Letting $n_i$ tend suitably to infinity we obtain

$$\lim f(n_i + x) = f^*(x),$$

where $f^*(x)$ is a limit function such that $f^*(0) = \bar f$. Taking $n = n_1, n_2, \ldots$ in the functional equation of periodicity satisfied by $[f(n)]$, we conclude that $[f^*(p)] = e$ and thus $e \le f^*(p) \le \bar f$. But if $p$ is a period of $[f(n)]$, so is $\lambda p$. Hence for every integer $\lambda$ we must have $e \le f^*(\lambda p) \le \bar f$, and therefore $e \le f(x) \le \bar f$.

On the other hand, considering the lower bound $f'$ of $f$, and setting $e' = [f']$ where $e' \neq f'$, we obtain in the same way $f' \le f(x) \le e' + 1$.

Comparing these two results, it is clear that we must have $e' = e$, and thus

$$e < f' \le f(x) \le \bar f < e + 1,$$

where $e$ is an integer. This completes our proof.

As a very simple example of a recurrent sequence, let us mention the sequence defined by

$$\left[\tfrac{3}{2} + \sin^2 n\right] \qquad (n = 0, \pm 1, \pm 2, \ldots),$$

where the function $f$ employed satisfies the condition stated above. This recurrent sequence, with values 1 and 2, is not periodic.
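The simple example above lends itself to a small numerical experiment. The sketch below assumes that the garbled formula reads $[3/2 + \sin^2 n]$ (a reading chosen because it yields exactly the values 1 and 2 and has no integer as an extremum); the window, the range of trial periods, and the helper names are illustrative assumptions. It checks the set of values, looks for a period among small integers, and measures how far apart consecutive occurrences of one short block lie.

```python
import math

def a(n):
    """Integer part of 3/2 + sin^2(n) (assumed reading of the garbled formula)."""
    return math.floor(1.5 + math.sin(n) ** 2)

def has_period(p, window):
    """True if a(n + p) == a(n) for every n in the finite window."""
    return all(a(n + p) == a(n) for n in window)

def max_gap(word, terms):
    """Largest distance between consecutive occurrences of `word` in `terms`."""
    k = len(word)
    hits = [i for i in range(len(terms) - k + 1) if terms[i:i + k] == word]
    return max(later - earlier for earlier, later in zip(hits, hits[1:]))

window = range(-1000, 1001)            # finite stand-in for all integers
terms = [a(n) for n in window]
values = set(terms)                    # values taken on the window
periods_found = [p for p in range(1, 51) if has_period(p, window)]
gap_of_122 = max_gap([1, 2, 2], terms) # return gap of the block 1, 2, 2
print(values, periods_found, gap_of_122)
```

A bounded return gap on a finite window is of course only consistent with recurrence and does not prove it; the proof is the argument given in the text.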

What interests us in such a sequence is the set of limit sequences that can be formed from it. All the members of this recurrent set serve equally well to define it. Consequently, if every partial sequence occurring in it also occurs in reverse order, the reversed sequence defines the same recurrent set as the given sequence. In this case it is natural to say that the given sequence is "reversible", and in the contrary case "irreversible".

It may also happen that the given sequence contains arbitrarily long sequences which are symmetric. In this case there will exist at least one limit sequence which is actually symmetric but which serves as well as the given sequence to define the recurrent set. In this case we shall say that the recurrent set of sequences is "symmetric". If there exist $k$ symmetric limit sequences we say that the set is $k$ times symmetric. We shall see presently how one can obtain recurrent sets of sequences which are symmetric two or even several times. It is quite probable that a recurrent set could be symmetric an infinite number of times.

Our way of constructing recurrent sequences by means of periodic or almost periodic functions gives us only sequences that are "almost periodic" with a finite or even an infinite number of periods. The example of MORSE is of an entirely different character (15). It would be extremely interesting to obtain other essentially nonperiodic recurrent sequences.
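Footnote (15) characterizes Morse's sequence by a parity rule: $f(n)$ is 1 or 2 according as the binary expansion of $n$ contains an even or an odd number of powers of 2, with $f(-n) = f(n - 1)$. The explicit equations of the footnote are not legible, so the sketch below implements only this stated rule; the function name and the range tested are illustrative assumptions. It also checks, on a finite prefix, that no block occurs three times in immediate succession, a known property of this sequence that marks its difference from a periodic one.

```python
def morse(n):
    """Morse's recurrent sequence as described by the parity rule of footnote (15)."""
    if n < 0:
        return morse(-n - 1)              # f(-m) = f(m - 1) with m = -n
    ones = bin(n).count("1")              # number of powers of 2 in n
    return 1 if ones % 2 == 0 else 2

def has_cube(terms, max_block):
    """True if some block of length <= max_block occurs three times in a row."""
    for size in range(1, max_block + 1):
        for i in range(len(terms) - 3 * size + 1):
            first = terms[i:i + size]
            if first == terms[i + size:i + 2 * size] == terms[i + 2 * size:i + 3 * size]:
                return True
    return False

prefix = [morse(n) for n in range(512)]
two_sided = [morse(n) for n in range(-4, 4)]
print(prefix[:8], two_sided, has_cube(prefix, 20))
```

The two-sided extension is symmetric about the gap between the indices $-1$ and $0$, which is what the relation $f(-n) = f(n - 1)$ expresses.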

After these remarks on recurrent sequences, let us study more closely the recurrent motions in the extended neighborhood of such a symmetric periodic motion.

In the first place, if the recurrent set of the sequence we choose is not reversible, it is evident that the set of motions cannot coincide with the symmetric set. On the other hand, if a recurrent set contains even a single symmetric motion, the set must be symmetric. Indeed the $\alpha$ and $\omega$ sets of a recurrent motion must coincide with the set itself.

But we cannot say that a recurrent set attached to a set of reversible sequences must coincide with its symmetric image. What is certain is that if the set in symmetric position does not coincide with the given set, it corresponds to the same set of sequences. In the same way, if the set of sequences is symmetric and hence reversible, we cannot say so either. It is true that here we know there exist points of the set corresponding precisely to the various symmetric sequences, but nothing seems to require that they correspond to a point of the axis of symmetry. When one of the points of the axis of symmetry corresponds to such a sequence, the given recurrent set will evidently be symmetric.

It may well happen that the chosen recurrent sequence admits two or even several symmetries. Consider for instance the sequence $[\varphi(n)]$, where $\varphi(n)$ is defined by the following equation,

$$\varphi(n) = \tfrac{3}{2} + \sin^2\left(n - \tfrac{1}{2}\right) + \sin^2 \sqrt{2}\left(n - \tfrac{1}{2}\right),$$

where $[\varphi(n)]$ takes only the values 1, 2, 3. This sequence is symmetric with respect to 0, 1. But the limit functions of $\varphi(n)$ are the functions

$$\tfrac{3}{2} + \sin^2\left(n + c_1 - \tfrac{1}{2}\right) + \sin^2 \sqrt{2}\left(n + c_2 - \tfrac{1}{2}\right).$$

Consequently there are four different types of symmetry, namely for $c_1 = 0, \ldots$ and $c_2 = 0, \ldots$ (the second value in each case is illegible in the source). Starting from more general analogous functions one can obtain recurrent sequences admitting an arbitrary number of symmetries of this kind.


(15) See (14). Indeed his method of defining the recurrent sequence $f(n)$ can be replaced by explicit equations [illegible in the source apart from the fragment $g(n) = n - \ldots$, $(n = 0, 1, \ldots)$], with $f(-n) = f(n - 1)$. Thus $f(n)$ reduces to 1 or 2 according as the expression of $n$ as a binary sum contains an even or an odd number of powers of 2.
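The symmetry of the example sequence with respect to the pair 0, 1 can be illustrated numerically. The printed formula is badly damaged, so the sketch below rests on an assumed reconstruction, $\varphi(n) = 3/2 + \sin^2(n - 1/2) + \sin^2 \sqrt{2}(n - 1/2)$, chosen because it takes integer parts 1, 2, 3 only and is invariant under $n \to 1 - n$; the function name and the window are illustrative.

```python
import math

SQRT2 = math.sqrt(2.0)

def phi_floor(n):
    """Integer part of the assumed reconstruction of phi(n)."""
    t = n - 0.5                               # centre of symmetry lies between 0 and 1
    value = 1.5 + math.sin(t) ** 2 + math.sin(SQRT2 * t) ** 2
    return math.floor(value)

window = range(-300, 301)
values = {phi_floor(n) for n in window}
# Symmetry with respect to 0, 1: the term at n equals the term at 1 - n.
mirror_ok = all(phi_floor(n) == phi_floor(1 - n) for n in window)
print(sorted(values), mirror_ok, [phi_floor(n) for n in range(-2, 4)])
```

Under this reading the reflection $n \to 1 - n$ changes only the sign of $n - 1/2$, which the squared sines do not see; the other symmetries mentioned in the text arise in the same way for suitably shifted limit functions.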

To arrive at a sufficiently general result I shall introduce here the idea of a $\Sigma$ recurrent set of motions (see IV, chapter 5). These sets are of capital theoretical importance. We define such a set for the nonperiodic case in the following way. Given any recurrent set of points $R$, the corresponding $\Sigma$ recurrent set will contain $R$ as well as every other point connected to the set $R$ by closed connected sets which contain no $\alpha$ or $\omega$ point asymptotic to a particular hyperbolic point. It can be shown that the $\Sigma$ recurrent set so obtained is closed, and that its closed connected elements are transformed among themselves by $T$ and behave like a kind of point in almost all respects. For this reason we call them $\Sigma$ points. Although a $\Sigma$ recurrent set may contain several recurrent sets, the points of a single $\Sigma$ point can hardly be distinguished topologically from one another.

We can now prove the following results:

To every irreversible recurrent sequence of integers [parenthetical condition illegible in the source] there will always correspond at least one recurrent motion in the chosen extended neighborhood of the hyperbolic point. No motion of the corresponding irreversible recurrent set can be symmetric in this case.

To every reversible recurrent sequence there will correspond either a recurrent motion of a symmetric recurrent set of motions or two recurrent motions in relatively symmetric position.

To every symmetric recurrent sequence there will correspond a symmetric motion which is either a motion of a symmetric $\Sigma$ recurrent set, or homoclinic to such a set, or heteroclinic to two $\Sigma$ recurrent sets in relatively symmetric position. Moreover, the same symmetric $\Sigma$ recurrent set, or the same pair of such sets in relatively symmetric position, yields symmetric homoclinic or heteroclinic motions belonging to all possible symmetric sequences.

Indeed, I have shown in my memoir why there must always exist points of $QRST$ (see figure 13 above) corresponding to an arbitrary sequence of integers. The $\alpha$ or $\omega$ limit sets are attached to the same sets of limit sequences. Among the limit points there will exist at least one recurrent point, and among the limit points of this point there will exist a recurrent point with the given recurrent sequence. After what we have already said, one sees that the first two results stated are evident.

To prove the third result we must take up again the reasoning of my memoir. Choose any half-sequence $k, l, m, \ldots$ of the given recurrent set; to this sequence there necessarily corresponds a closed connected set, $D$, which extends from $QR$ to $ST$ and which remains in the given extended neighborhood during the indefinite repetition of $T^{-1}$, such that every point of this set re-enters the quadrilateral $QRST$ after $k$ iterations of $T^{-1}$, then after $l$ iterations, and so on (see fig. 15). There also exist analogous sets $C$ which extend from $QT$ to $RS$ (see the dotted lines of fig. 15) and which correspond to any half-sequence $\ldots, i, j$ of the given set.

Suppose now that, starting from such a connected set $C$ which extends from $QT$ to $RS$ and which is attached to the half-sequence $\ldots, i, j$, we apply the transformation $T^{-k}$, where $\ldots, j, k$ is the half-sequence obtained by adjoining the integer $k$ on the right. By the same procedure that I have employed, one sees that the set $T^{-k}(C)$ will contain at least one closed connected part which extends from $QT$ to $RS$ and which corresponds to this extended half-sequence. This set is therefore among the sets corresponding to this half-sequence. But it will have the property that its image under $T^k$ lies inside a set attached to $\ldots, i, j$.

By adjoining successive admissible integers to the half-sequence and passing to the limit, one obtains at least one analogous closed connected set whose attached sequence belongs to the given recurrent set, and which is such that all its images in $QRST$ lie on those closed connected sets which extend from $QT$ to $RS$. It is to be remarked that the limit set of a series of closed connected sets associated with such half-sequences, which tend toward a limit half-sequence, must contain at least one limit set corresponding to the limit half-sequence.

If one applies the transformations $T^j, T^i, \ldots$ to one of these closed connected sets corresponding to the half-sequence $\ldots, i, j$, the successive sets will always lie inside one of these closed connected sets of the same kind. Passing to the limit one sees therefore that there is at least one recurrent point $L$ lying in such a set.




Consider the corresponding closed connected set $\lambda$ as well as all the other closed connected sets of the same kind which contain a point of the same $\Sigma$ recurrent set. In this way sets with interesting properties are defined. In particular they are transformed among themselves in the following manner. A set with associated half-sequence $\ldots, h, i, j$ is always transformed by $T^j$ into a part of an analogous set associated with $\ldots, h, i$; the same set will be transformed by $T^{-k}$ ($k$ being admissible) into one or even several complete sets corresponding to the half-sequence $\ldots, i, j, k$. Moreover there will evidently exist at least one such closed connected set corresponding to a given half-sequence, for example to any such sequence constituting half of a given symmetric recurrent sequence. Let $\lambda$ be such a set.

But there is evidently a complete geometric symmetry between the two kinds of these closed connected sets. Because of this there will exist an analogous set $\lambda'$ in symmetric position.

It is evident that these two sets $\lambda$ and $\lambda'$ have at least one point $M$ of the axis of symmetry in common and that this point corresponds to a symmetric sequence $\ldots, l, k, k, l, \ldots$, where $k, l, \ldots$ is the half-sequence of any symmetric recurrent sequence.

Suppose in the first place that the $\Sigma$ recurrent sets defined by $\lambda$ and $\lambda'$ coincide. This set is symmetric. Moreover the symmetric point $M$ belongs to this set or else constitutes a homoclinic point of it. Indeed, apply the successive transformations $T^j, T^i, \ldots$ to $\lambda$. The successive transformed sets lie completely inside a closed connected set of the same kind. If one could prove that these images tend uniformly toward the $\Sigma$ recurrent set, the proof would be complete, since the successive images of $M$ lie in the transformed sets. But in the contrary case, passing to the limit, one would find closed connected sets containing a point of the $\Sigma$ recurrent set together with other points. Now all the points of such a set correspond to a single limit sequence $\ldots, i, j, k, \ldots$, and the set remains in $QRST$ when it is transformed successively by $T^{-k}, \ldots$ or by $T^j, T^i, \ldots$ Such a set can contain no $\alpha$ or $\omega$ point asymptotic to a periodic point, because the corresponding sequence contains no periodic limit sequence. By the very definition of $\Sigma$ recurrent sets it follows that such points cannot exist, which completes our proof in this case. The modifications necessary to treat the other case are evident.

Let us remark here that an analogous analysis can give us valuable information concerning the $\alpha$ and $\omega$ asymptotic branches of such $\Sigma$ recurrent sets in the same extended neighborhood of a symmetric hyperbolic motion.

Let us remark also that instead of beginning with a single symmetric hyperbolic periodic motion and a corresponding homoclinic motion attached to it, we could have employed two nonsymmetric hyperbolic motions in relatively symmetric position forming a cycle of two elements; we could even have employed more complicated symmetric cycles.

13. - On the totality of symmetric motions.

Before passing to the consideration of more general motions, it is interesting to consider the distribution of the periodic points of a line of symmetry such as $\alpha\beta$.

The set $E$ of periodic points of a line of symmetry of $S_2$ which are neither hyperbolic with double branches nor degenerate elliptic is dense in itself on both sides. Hence the complement of its derived set $E'$ is formed of open intervals (if there are any) whose end points are not periodic, except perhaps of the special types mentioned.

Indeed, consider a hyperbolic periodic point $P$ of $\alpha\beta$ and a corresponding symmetric homoclinic loop. If one considers the successive images, under $T$ and under $T^{-1}$, of a point $W$ of $\alpha\beta$ near $P$, two of these images, say the $k$-th ones, are the first to enter $QRST$. As $W$ tends toward $P$ it follows that $k$ increases indefinitely. Hence there are instants at which the two $k$-th images cross the diagonal line $QS$ and coincide. At such an instant the corresponding point $W$ is near $P$ for large $k$, and is symmetric periodic.

For a nondegenerate elliptic point there are always neighboring symmetric periodic points, as we have remarked.

It is evident that the points homoclinic to such a symmetric hyperbolic periodic motion, or heteroclinic to two such periodic motions in symmetric position, are everywhere dense in $E$ but cannot belong to the open intervals or even form one of their end points.

Let us also prove the following fact:

Let $k_1$, $l_1$ be integers without common factor such that $\sigma' > 2\pi l_1/k_1 > \sigma''$. The set of periodic points of the line of symmetry considered for which the characteristic ratio has the given value $l/k = l_1/k_1$ is also everywhere dense in $E$.

Indeed, in the neighborhood of a point of $E$ there are two points $A$, $B$ which are homoclinic to a chosen hyperbolic motion. Consider the region bounded by the $\omega$ asymptotic arc (for example) and by the segment $AB$ of the line of symmetry (see fig. 16). Under iteration of $T$ this arc tends toward the hyperbolic periodic point $P$ while the area of the region remains invariant.

The segment $AB$ of the line of symmetry must be cut at points $Q'$ by every other (nondouble) $\omega$ asymptotic branch; otherwise such a branch would fail to enter either the region or its exterior. It must not be forgotten that two $\omega$ asymptotic branches have no point in common.

Now let $l'/k'$ and $l''/k''$ be two characteristic ratios for which $l'/k' < l_1/k_1 < l''/k''$, attached to hyperbolic periodic points; we have already shown that such characteristic integers always exist. Let $P'\omega Q'$ and $P''\omega Q''$ be two corresponding $\omega$ asymptotic branches (see fig. 16). Under iteration of $T$, $Q'$ tends toward $P'$ while the coordinate $\theta$ increases with limiting mean velocity $2\pi l'/k'$. The corresponding velocity of $Q''$ is $2\pi l''/k''$. Hence for $p$ sufficiently large, the $p$-th image must cut the line of symmetry with $\theta = 2p\pi l_1/k_1$.

Such a point of intersection is evidently symmetric periodic with characteristic ratio $l_1/k_1$.

14. - On the sets $\Sigma_Q$.

Choose an arbitrary point $Q$ and define $\Sigma_Q$ as the set of points which can be joined to $Q$ by closed connected sets not crossed by the chosen $\alpha$ asymptotic branch or by the chosen $\omega$ asymptotic branch (16). I have shown (loc. cit.) that such a set $\Sigma_Q$, which contains $Q$, is always closed and connected, and depends neither on the point $Q$ chosen in $\Sigma_Q$ nor on the hyperbolic periodic point with which one begins. There are four principal kinds of sets $\Sigma_Q$, or "$\Sigma$ points", which we classify according to the asymptotic branches chosen:

1) the sets $\Sigma_Q$ without any point of the chosen asymptotic branches;

2) the sets $\Sigma_Q$ which reduce to a closed segment or to a point of one of these branches, but which contain no point of the other branch;

3) the homoclinic points $Q$ which belong to both branches at once;

4) the hyperbolic periodic point $P$ itself.

Here are some of the simple types of sets $\Sigma_Q$: the hyperbolic periodic points without double branch; the nondegenerate elliptic periodic points; the points homoclinic or heteroclinic to such points; the closed connected parts of the asymptotic branches, or of the corresponding asymptotic sets, which contain no homoclinic or heteroclinic point; the connected sets of double branches.

It seems to be very difficult to determine the set $\Sigma_Q$ attached to a degenerate elliptic periodic point.

(16) See the definition of the $\Sigma$ recurrent sets, section 12.


Any set $\Sigma_Q$ ($\Sigma$ point) is composed of points which can hardly be separated from one another by their topological properties. These sets are transformed among themselves by the transformation $T$ and enjoy upper continuity, that is to say, if $P$ tends toward $Q$, $\Sigma_P$ tends toward $\Sigma_Q$ or toward a part of $\Sigma_Q$. It is very probable that in the restricted problem, for example, one always has $\Sigma_Q = Q$ for $\mu \neq 0, 1$.

Let us remark also in this connection that the periodic points are densely distributed with respect to the sets $\Sigma_Q$. This fact follows immediately from the definition of the $\Sigma_Q$.

If one operates with the sets $\Sigma_Q$ as we have done with points, one can define the periodic $\Sigma$ sets, the recurrent $\Sigma$ sets as before, and so on. One thus obtains uniform results, of which we mention for example the following (see IV):

a). For any $\Sigma$ recurrent set whatever there always exist $\alpha$ and $\omega$ asymptotic sets which cross one another infinitely often.

b). The sets $\Sigma_Q$ relative to the $\Sigma$ points are these elements themselves ($\Sigma_\Sigma = \Sigma_Q$), and to define them one may employ any $\Sigma$ recurrent set and its asymptotic branches.

c). Every $\alpha$ or $\omega$ asymptotic set attached to a $\Sigma$ recurrent set cuts infinitely often every other $\omega$ or $\alpha$ asymptotic set attached to such a set.

d). Every closed and $\Sigma$ connected set which does not reduce to a single $\Sigma$ point contains an infinite number of $\Sigma$ points of every asymptotic set of any $\Sigma$ recurrent set. The successive images of such a $\Sigma$ connected set, either under $T$ or under $T^{-1}$, are everywhere $\Sigma$ dense.

Let us note also the fact, evident in our particular case:

The set of symmetric periodic points is necessarily everywhere $\Sigma$ dense.

This follows from the fact that one may employ two branches $\alpha$ and $\omega$ of such a hyperbolic point to define the $\Sigma_Q$, and from the fact that there exist other symmetric periodic motions in the immediate neighborhood of every homoclinic point.

We conclude also that the following result holds:

The closed lacunary intervals of a line of symmetry (if there are any) must each belong to a single symmetric set $\Sigma_Q$; and these $\Sigma_Q$ are all distinct. Consequently such an interval admits no homoclinic or heteroclinic points, even with respect to $\Sigma$ recurrent sets.

In this connection let us remark only that if two intervals corresponded to the same $\Sigma_Q$, this symmetric set would divide the surface into two or more invariant parts, which would contradict the hypothesis of transitivity.



15. - On the signature.

Starting from a hyperbolic periodic point $P$, fixed under $T^k$, with at least two adjacent nondouble branches, we obtained a loop $P\alpha Q\omega P$ where $Q$ is a homoclinic point (see fig. 13). I have called the figure formed by the complete interlacing of these two branches, or rather a figure topologically equivalent to it, the "signature" $S$ of the dynamical system (see IV, chapter V). To determine this figure it suffices to know the interlacing of a single complete branch such as $P\alpha\infty$ with the other, partial branch $Q\omega P$.

Indeed, an even much simpler topological figure suffices to characterize the interlacing.

This interlacing completely determines the topological relations of the $\Sigma$ points to one another. Since it is quite probable that in the general case the $\Sigma$ points reduce to ordinary points, one can appreciate the fundamental theoretical importance of such a symbol $S$.

Strictly speaking, the signature so obtained characterizes the transformation $T^k$ ($k > 1$) rather than $T$, since $T$ admits no hyperbolic fixed points in the case of the restricted problem (17). Thus in considering such a signature one is not considering precisely the restricted problem. For the modified problem, one replaces the space $S_3$ of states of motion by a space $S_3^{(k)}$ obtained by replacing $S_2$ by a three-dimensional Riemann surface in $S_3$ with $k$ sheets, in which $L_1$ and $L_2$ are the two multiple lines. One sees that such a signature of $T^k$ does not give complete topological information for the given problem.

(17) The two edges of $S_2$ function as elliptic fixed $\Sigma$ points of the transformation $T$.

Because of the symmetry of $T^k$ one can choose an interlacing which is symmetric (see fig. 17). Let us examine briefly what this means from the topological point of view.

A first way of giving the symbol is the following (18). We associate the points of the arc $Q_0\omega Q_1$, where $Q_1 = T^k(Q_0)$, with the real numbers $\lambda$ between 0 and 1 in a one-to-one and continuous manner; and we associate with $Q_{-1}\alpha Q_0$ the analogous numbers $\mu$ between $-1$ and 0 in such a way that for symmetric points we have $\lambda + \mu = 0$. Then we number a point of $Q_1\omega Q_2$ between $Q_1$ and $Q_2$ which is the transform $T^k(Q_\lambda)$ of a point $Q_\lambda$ of $Q_0\omega Q_1$ by associating it with the number $\lambda + 1$; then we associate the points $T^{2k}(Q_\lambda)$ of $Q_2\omega Q_3$ with $\lambda + 2$, and so on. We can number $Q_{-1}\omega Q_0$, $Q_{-2}\omega Q_{-1}$ in the same way by taking $\lambda$ negative. In this manner we associate with the asymptotic branch $P\omega\infty$ the numbers $\lambda$ between $-\infty$ and $+\infty$; and we associate analogously with $P\alpha\infty$ the numbers $\mu$ between $-\infty$ and $+\infty$ in a symmetric manner, such that at symmetric points of $P\alpha\infty$ and $P\omega\infty$ we always have $\lambda + \mu = 0$. Hence there is a particular pair $(\lambda, \mu)$ corresponding to every homoclinic point.

It is this set of pairs $(\lambda, \mu)$ corresponding to the homoclinic motions, in which only the relative order of the $\lambda$ and of the $\mu$ is important, which determines the signature $S^{(k)}$ and the interlacing of the asymptotic branches (19).

It is evident that if $(\lambda, \mu)$ is such a pair, $(\lambda + 1, \mu + 1)$ is one also, and there are other analogous laws.

Let us now remark that in the symmetric case the two systems of pairs $(\lambda, \mu)$ and $(-\mu, -\lambda)$ are essentially identical, independently of the method of numbering.

The signature $S^{(k)}$ of $T^k$ in the case of the restricted problem possesses the special property of remaining essentially the same when the order of the two parameters $\lambda$ and $\mu$ is reversed with change of sign.
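The two laws on pairs of parameters stated in this section (a pair $(\lambda, \mu)$ carries with it the pair $(\lambda + 1, \mu + 1)$, and in the symmetric case the exchange $(\lambda, \mu) \to (-\mu, -\lambda)$ preserves the system) can be made concrete on toy data. The sketch below is an illustration only: the pairs are hypothetical, the function names are invented for the example, and literal equality of parameter values is used as a simplification, whereas the text insists that only the relative order of the parameters matters.

```python
import math
from fractions import Fraction as F

def normalize(pair):
    """Representative of a pair modulo the shift (lam, mu) -> (lam + 1, mu + 1)."""
    lam, mu = pair
    shift = math.floor(lam)                  # bring lam into the interval [0, 1)
    return (lam - shift, mu - shift)

def is_symmetric(pairs):
    """True if the system of pairs is invariant under (lam, mu) -> (-mu, -lam)."""
    reduced = {normalize(p) for p in pairs}
    mirrored = {normalize((-mu, -lam)) for lam, mu in pairs}
    return reduced == mirrored

# Hypothetical pairs, chosen only to exercise the two laws.
symmetric_pairs = [(F(0), F(0)), (F(1, 3), F(-1, 3)),
                   (F(1, 4), F(1, 2)), (F(-1, 2), F(-1, 4))]
lopsided_pairs = [(F(1, 4), F(1, 2))]
print(is_symmetric(symmetric_pairs), is_symmetric(lopsided_pairs))
```

Exact fractions are used so that the comparison of pairs is not disturbed by rounding; a faithful treatment would compare order types of the two parameter sets rather than values.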

Conversely, if such a property holds for a dynamical system, the system will certainly possess a topological symmetry with respect to the $\Sigma$ points, if not an analytic symmetry.

How can the signature $S^{(k)}$ be used? Strictly speaking, the signature gives us a theoretical instrument which allows us to answer any special question.


(18) See IV.
(19) See IV.



For example, one may ask whether there exists a fixed point of $T^k$ inside the given loop $P\alpha Q\omega P$ (see fig. 17). As a point $L$ is made to travel around the loop (even a little inside it near $P$), its image travels around the image $P\alpha Q_1\omega P$, and one sees that in the situation indicated the angle of the vector joining $L$ to its image increases by $2\pi$. This result depends only on the particular topological character of $S^{(k)}$. Hence there will exist at least one such fixed point.

Let us give another, quite different example. One may ask whether it is possible to regard $T^k$ as the product of $\kappa$ factors $T$, each of which has a fixed point at the point $P$. For this it is evidently necessary that there exist precisely $\kappa l - 1$ pairs $(\lambda_i, \mu_i)$ with $0 < \lambda_i, \mu_i < 1$. More generally it must be possible to choose the parameters $\lambda$ and $\mu$ in such a way that if $(\lambda_i, \mu_i)$ is one of the pairs, the pair $(\lambda_i + 1/\kappa, \mu_i + 1/\kappa)$ is another.

It must be emphasized that in this second part of the memoir we have considered only the topological invariants of the restricted problem. These are extremely numerous and varied. But it is only the signature which seems to allow us to isolate all the independent topological invariants.

In conclusion we should like to indicate how one might define a signature $S$ corresponding to the transformation $T$ instead of $T^k$ ($k > 1$).

For this purpose, let us start again with the symmetric hyperbolic fixed point of $T^k$, and mark the points $P$, $P_1 = T(P)$, $\ldots$, $P_{k-1}$, which are symmetrically disposed with respect to the axis of symmetry. In what follows we take $k = 5$.

En commen^ant avec deux points symgtriques Q et Q des arcs sym6- 
triques Pa to et Putoo qui sont prfcs de P, prolongeons ces deux arcs sym6- 
triquement. Soit QQ la courbe bris£e form£e par ces deux arcs symgtriques. 



Consider at the same time the transformed arcs $Q_1\bar Q_1$, $Q_2\bar Q_2$,
$Q_{-1}\bar Q_{-1}$ and $Q_{-2}\bar Q_{-2}$. If $\bar Q$ comes to coincide
with $Q_1$, all four pairs $(\bar Q_{-2}, Q_{-1})$, $(\bar Q_{-1}, Q)$,
$(\bar Q, Q_1)$, $(\bar Q_1, Q_2)$ come to coincide, at $S$, $T(S)$,
$T^2(S)$, $T^3(S)$ respectively (see fig. 18). We now extend $P_2\alpha\infty$
until its first meeting with the symmetric branch, at a point $W$, both arcs
being extended symmetrically. Thus the point $W$ lies on the axis of symmetry.

In this way we obtain a symmetric cycle attached to the five periodic points
$P, P_1, P_2, P_{-1}, P_{-2}$. To specify such a cycle topologically one must
first of all determine the other points of intersection of the arcs
indicated; for $\mu$ small enough the cycle must have the special form shown
in the figure, since for $\mu = 0$ the cycle reduces to a circle
$\rho = \text{const}$. The total interlacing will be determined solely by the
interlacing of the complete branch $P\alpha\infty$ with the $\omega$ branches
of the cycle; and even this partial interlacing obeys almost obvious laws
which I shall not give here.

The interlacing of the complete branch $P\alpha\infty$ with a cycle of this
kind will thus determine the signature $S$ associated with the transformation
$T$ of the actual restricted problem of three bodies.

I shall not attempt to study its properties here. Evidently the topological
specification of such a signature $S$ would have to be much more complicated
than that of the partial signature $S^{(k)}$ which we employed above.


III.

Some general reflections.

1. - Mathematical significance of the restricted problem of three bodies.

In general an irreversible dynamical system with two degrees of freedom
admits no transformations into itself other than the inner automorphism
obtained by replacing $t$ by $t + c$. But it may also admit a finite group of
outer automorphisms. This is what happens in the case of the restricted
problem of three bodies, where one may replace $x, y, t$ by $x, -y, -t$
respectively without altering the differential equations or even the constant
$C$ which appears in the Jacobi integral.
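
The invariance just stated can be checked by direct substitution. The sketch below assumes the usual rotating-frame form of the equations, which is not written out in this section but is consistent with the Jacobi integral $x'^2 + y'^2 = 2\Omega - C$ used later, and it assumes that $\Omega$ is even in $y$; the barred symbols are notation introduced here only for the check.

```latex
\begin{aligned}
&x'' - 2y' = \Omega_x(x,y), \qquad y'' + 2x' = \Omega_y(x,y),
  \qquad \Omega(x,-y) = \Omega(x,y), \\
&\bar x(t) = x(-t), \qquad \bar y(t) = -y(-t), \\
&\bar x' = -x'(-t), \quad \bar y' = y'(-t), \quad
  \bar x'' = x''(-t), \quad \bar y'' = -y''(-t), \\
&\bar x'' - 2\bar y' = x''(-t) - 2y'(-t)
  = \Omega_x\bigl(x(-t), y(-t)\bigr) = \Omega_x(\bar x, \bar y), \\
&\bar y'' + 2\bar x' = -\bigl(y''(-t) + 2x'(-t)\bigr)
  = -\Omega_y\bigl(x(-t), y(-t)\bigr) = \Omega_y(\bar x, \bar y), \\
&\bar x'^2 + \bar y'^2 - 2\Omega(\bar x, \bar y)
  = x'^2 + y'^2 - 2\Omega(x,y) = -C .
\end{aligned}
```

The fourth and fifth lines use the fact that $\Omega_x$ is even and $\Omega_y$ is odd in $y$ when $\Omega$ is even in $y$.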

We have considered only the neighborhood of the integrable case $\mu = 0$.
But this limitation has been useful to us chiefly in carrying out the
effective reduction of the restricted problem to that of a conservative
transformation $T$ of a surface of section into itself. I have very little
doubt that this kind of reduction remains valid over an extended range of
values of $C$ and of $\mu$. Unfortunately I have been unable to find any
method of constructing such a surface $S_2$ other than that of analytic
continuation starting from the integrable case $\mu = 0$. When such a
reduction ceases to be possible, the restricted problem becomes much more
complicated from the mathematical point of view.

2. - The minimum method of Tonelli.

Without any doubt the Calculus of Variations furnishes the most suggestive
analytic forms of dynamical problems. In the variables $x, y, t$ a parametric
form of the restricted problem is the following:

$$\delta \int \Bigl[(x y' - y x') + \sqrt{2\Omega(x,y) - C}\,\sqrt{x'^2 + y'^2}\Bigr]\,dt = 0.$$

Here the first term under the integral sign represents twice the algebraic
area $A$ of a sector bounded by the two rays which issue from the origin and
pass through the points $(x_0, y_0)$ and $(x_1, y_1)$, and by the curve which
joins these two points; this area is measured positively in the sense which
carries the positive $x$-axis into the positive $y$-axis. The other term
represents a kind of modified length of the curve considered, which we shall
denote by $L$. Hence the problem is posed in the following intuitive manner:
$\delta(2A + L) = 0$.

The function under the integral sign is entirely regular in the sense of
Hilbert and of Tonelli (20), except at the origin and along the curve of zero
velocity, where it ceases to be analytic. Hence the integral enjoys lower
semi-continuity without other exception, and the classical arguments of
Tonelli allow us to see almost geometrically why the ordinary extremals give
a relative minimum when the points $(x_0, y_0)$ and $(x_1, y_1)$ are not too
far from one another.

Employing the regularizing variables $p, q, \tau$, where $x = p^2 - q^2$,
$y = 2pq$, $dt = 4(p^2 + q^2)\,d\tau$, one obtains an analogous form

$$\delta \int \Bigl[(p^2 + q^2)(p q' - q p') + \sqrt{p^2 + q^2}\,\sqrt{2\Omega(p^2 - q^2,\, 2pq) - C}\,\sqrt{p'^2 + q'^2}\Bigr]\,d\tau = 0,$$

which has the advantage that the new integral is regular at the origin and
hence everywhere inside the oval of zero velocity considered.
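
The passage to the regularizing variables can be verified term by term. The following is a sketch of that computation; the constant $m$ in the last line is an illustrative symbol, not the author's, standing for the strength of the pole that $\Omega$ is assumed to have at the origin.

```latex
\begin{aligned}
x &= p^2 - q^2, \qquad y = 2pq, \\
dx &= 2(p\,dp - q\,dq), \qquad dy = 2(q\,dp + p\,dq), \\
x\,dy - y\,dx &= 2(p^2 + q^2)(p\,dq - q\,dp), \\
dx^2 + dy^2 &= 4(p^2 + q^2)(dp^2 + dq^2), \\
\sqrt{2\Omega - C}\,\sqrt{dx^2 + dy^2}
  &= 2\sqrt{p^2 + q^2}\,\sqrt{2\Omega(p^2 - q^2,\, 2pq) - C}\,\sqrt{dp^2 + dq^2}, \\
\Omega \sim \frac{m}{\sqrt{x^2 + y^2}} = \frac{m}{p^2 + q^2}
  \;&\Longrightarrow\; (p^2 + q^2)(2\Omega - C) \to 2m
  \quad \text{as } (p, q) \to (0, 0).
\end{aligned}
```

Both terms of the integrand acquire the same factor 2, which may be dropped from a condition of the form $\delta(\ldots) = 0$. Since the integrand is homogeneous of degree one in the derivatives, the particular choice of the parameter $\tau$ does not affect the extremals. The last line indicates why the product under the square roots stays finite at the origin.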

Unfortunately the minimum method applies to none of the closed trajectories
of the restricted problem, because the integral never has a relative minimum
along such a trajectory; this is why the analogue of Whittaker's criterion
(21), which I have given for the irreversible case and of which Tonelli has
given a very simple proof (22), does not apply either.

(20) See the book of Tonelli: Fondamenti di Calcolo delle Variazioni, vols. 1
and 2, Bologna (1922-1924).

(21) See his article: On Periodic Orbits, Monthly Notices of the Royal
Astronomical Society, vol. 62 (1902). The criterion was rigorously
established by Tonelli and Signorini in 1912. See the article of Tonelli:
Sulle orbite periodiche, Rendiconti della R. Accademia

Must we then give up the hope of obtaining all the closed trajectories by the
minimum method? I believe that it is only by formulating suitable
isoperimetric problems that the minimum method can be employed up to a
certain point. For example, consider the closed rectifiable curves without
double points on a two-dimensional convex analytic surface, each of which
divides the surface into two parts of total curvature equal to $2\pi$. The
shortest of these curves must be a closed geodesic. It is to Poincaré that
this isoperimetric method is due (23).

Can an analogous isoperimetric method be employed in the restricted problem?
To answer this question I shall use an analysis (see II, section 16) based on
broken arcs of extremals. Choosing a cyclic sequence of suitable points
$P_1, \ldots, P_n$ situated respectively on $n$ transversal curves, no
adjacent pair of which ($P_i P_{i+1}$ or $P_n P_1$) is separated by two
conjugate points of the corresponding extremal, one obtains an expansion of
the integral $I$ whose variable part begins with a quadratic form in
$x_1, \ldots, x_n$, where the integral $I$ is taken along a nearby broken
curve of extremal arcs $P_1 P_2 \ldots P_n P_1$ and where $x_1, \ldots, x_n$
represent the distances of the vertices $P_1, \ldots, P_n$ from the given
closed extremal.

In the case of a relative minimum the quadratic form would be positive
definite or semi-definite. An examination of the integral $I$ in the
integrable case $\mu = 0$ shows at once that the case of a minimum cannot
occur here, since there are always at least three conjugate points along such
a closed curve.

If a natural isoperimetric condition of this kind is imposed, it introduces
one further relation,

$$F \equiv \sum_i \beta_i x_i + \cdots = 0.$$

In doing so, one lowers the index of the quadratic form by at most one. Hence
by introducing only a single suitable isoperimetric condition, one can treat
only the case of index 1. But I have shown that this case occurs precisely
when there are at most two conjugate points along the complete curve (24),
which is not the case even for small $\mu$.


dei Lincei, vol. 21 (1912), and also the article of Signorini: Esistenza di
un'estremale chiusa dentro un contorno di Whittaker, Rendiconti del Circolo
Matematico di Palermo, vol. 33 (1912).

(22) See his article: Sulle orbite periodiche irreversibili, Memorie della R.
Accademia delle Scienze dell'Istituto di Bologna, ser. 8, vol. 1 (1923-1924).

(23) Sur les lignes géodésiques des surfaces convexes, Transactions of the
American Mathematical Society, vol. 6 (1905).

(24) See II, section 19.



It would therefore be necessary to impose at least two isoperimetric
conditions in order to reach the goal in some special cases. One might
perhaps impose only a single condition by requiring in addition that the
admissible curves be symmetric with respect to the $x$-axis (which amounts to
dividing the index roughly by two).

But for large $C$ the index, which is always given by the number of conjugate
points corresponding to a complete period, becomes large also. Hence it seems
to be necessary to employ more and more numerous isoperimetric conditions
even to obtain the fundamental retrograde closed trajectory $L_1$, whose
existence I have proved in general (see I, section 18).

The entirely different method which I developed to prove the existence of
such a trajectory $L_1$ is based on some special properties of the function
$\Omega(x, y)$. I have shown moreover that if all the other trajectories
circulate indefinitely in the same sense around $L_1$ in the space $S_3$,
there must exist another closed trajectory $L_2$ and a surface of section
having the trajectory $L_1$ as its only boundary; I have even formulated an
analytic condition for this kind of circulation.

All this suggests that it is probably almost necessary to employ methods more
special than that of the minimum in order to succeed in determining the
closed trajectories of the restricted problem.

Let us make here a remark which may clarify the situation. The group of
two-dimensional point transformations which change $x, y$ into
$\bar x, \bar y$ is the one attached to the problem of the Calculus of
Variations which we consider in the restricted problem. But the fundamental
group for the restricted problem is rather that of three-dimensional point
transformations, since the three variables in the three differential
equations of the first order may be changed freely. Consequently most of the
invariants of the special group of the Calculus of Variations have no
significance for this more fundamental group.

Nevertheless the Calculus of Variations will without any doubt always remain
one of the most powerful and most beautiful instruments in the domain of the
differential equations of Dynamics. In particular it allows us to solve the
problem of the relative minimum for short arcs and thus to reduce questions
"in the large" to questions of critical points of an ordinary function.

3. - The method of critical points.

By introducing a certain computable function of $\rho^*, \theta^*, \mu, C$,
we have seen how the consideration of the closed trajectories which make the
circuit of $L_1$ and $L_2$ $k$ times with an angular advance of $2l\pi$ may
be replaced by that of the critical points of a function of $2k$ independent
variables $\rho^*, \theta^*, \rho_1^*, \theta_1^*, \ldots, \rho_{k-1}^*,
\theta_{k-1}^*$, whatever $k$ may be. Poincaré long ago made such a local
reduction of the restricted problem using only two variables, as we remarked
above.



It would be very interesting to apply the fundamental inequalities of Morse
to the study of the critical points of these functions for $k = 1, 2$.
Nevertheless it seems to me almost certain that only rather obvious results
would be obtained in this way. Moreover, after our direct study of the
transformation $T$, it would probably be very difficult to obtain in this
manner our results concerning the asymptotic distribution of the closed
trajectories, which depend essentially on the asymptotic study of the fixed
points of the transformation $T$ (25).

But it is in dynamical problems, either with two degrees of freedom and no
surface of section, or with several degrees of freedom, that the method of
Morse can doubtless be employed with great advantage to discover all the
closed trajectories which are topologically necessary. For example, Morse has
already proved the existence of $n(n-1)/2$ closed geodesics, of the same
types respectively as the principal closed geodesics of an ellipsoid, on
every surface homeomorphic to the surface of an $n$-dimensional sphere (26).

Indeed the method of the Calculus of Variations "in the large" of Morse (27),
based on the study of general critical points, and the method of asymptotic
study which I have employed are in some sense complementary methods in this
domain.

4. - On the ergodic theorem (28).

From the purely topological point of view, this very remarkable theorem
expresses nothing more than the recurrence theorem of Poincaré. But if one
considers only the group of continuous transformations which preserve volumes
(as is customary in statistical mechanics, for example), this theorem is of
the very first importance. According to this theorem, in the restricted
problem all the trajectories, except perhaps those of a set of measure zero,
will traverse successively all the regions of the surface of section $S_2$,
and even every measurable set of $S_2$, with a limiting probability which is
the same in the past ($t$ decreasing) as in the future ($t$ increasing).
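
The statement about limiting probabilities can be illustrated numerically on a much simpler area-preserving map than the transformation $T$ of the restricted problem. The sketch below is only an analogy: it uses a rotation of the circle by an irrational angle (the constant `ALPHA`, the arc `[0, 0.25)` and the function names are illustrative choices, not taken from the memoir) and compares the fraction of time an orbit spends in the arc in the future and in the past.

```python
import math


def time_average(indicator, x0, alpha, n_steps):
    """Average of `indicator` along the orbit x -> x + alpha (mod 1)."""
    x = x0
    total = 0
    for _ in range(n_steps):
        total += indicator(x)
        x = (x + alpha) % 1.0  # Python's % keeps the result in [0, 1)
    return total / n_steps


# Irrational rotation number (illustrative assumption: the golden mean).
ALPHA = (math.sqrt(5.0) - 1.0) / 2.0


def in_region(x):
    """Indicator of the arc [0, 0.25), whose measure is 0.25."""
    return 1 if x < 0.25 else 0


# Future (t increasing) and past (t decreasing) averages from the same state.
forward = time_average(in_region, 0.1, ALPHA, 100_000)
backward = time_average(in_region, 0.1, -ALPHA, 100_000)
```

Under these assumptions both averages approach the measure of the arc, which is the behavior the theorem asserts for almost all initial states of a measure-preserving system when the motion is metrically transitive.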


(25) For dynamical problems with $n$ degrees of freedom see in this
connection an article by D. C. Lewis and myself: On the Periodic Motions Near
a Given Periodic Motion of a Dynamical System, Annali di Matematica, ser. 4,
vol. 12 (1933-1934), and also an article by Lewis: On Certain Periodic
Motions of Dynamical Systems with More Than Two Degrees of Freedom, American
Journal of Mathematics, vol. 56 (1934).

(26) See his note: Closed Extremals, Proceedings of the National Academy of
Sciences, vol. 16 (1929). I had previously proved by my "minimax" method that
at least one always exists (see V, chap. 7); the minimax method is that of
critical points of index one only, whereas Morse considers the case of
arbitrary indices.

(27) See his book: The Calculus of Variations in the Large, New York (1934).

(28) My proof of the ergodic theorem is to be found in the Proceedings of the
National Academy of Sciences, vol. 17 (1931). A historical note by Koopman
and myself, Recent Contributions to the Ergodic Theory, appeared there in
vol. 18 (1932).



Let us remark here only that our researches above show clearly that the
ergodic property does not hold, even in a single direction of time, for
extremely numerous motions. Indeed, for the heteroclinic motions attached to
the periodic motions this property holds in the two directions of time
separately. But if one writes a doubly infinite sequence of bounded integers
....ijklm.... in which two particular finite sequences are repeated more and
more, it is evident that the corresponding motion (see section 10, part II)
does not in general enter a chosen neighborhood of the two (periodic) motions
obtained by an indefinite repetition of the two sequences with a determinate
probability. For such a motion the property cannot hold even in one direction
of time. On the other hand, if such a sequence is written down at random (if
the possibility of doing so may be admitted), the corresponding motions
satisfy the ergodic theorem.
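
The failure of a limiting probability for such sequences can be seen in a toy computation. The sketch below is an illustration under simplifying assumptions, not the author's construction: it uses only the future half of the sequence, takes the two finite sequences to be the single symbols 0 and 1, and lets run number `j` repeat its symbol `base**j` times (the function name and the choice `base=10` are hypothetical). The frequency of the symbol 1 stands in for the frequency of visits near one of the two periodic motions.

```python
def running_frequencies(n_runs, base=10):
    """Frequency of the symbol 1 measured at the end of each run.

    Run j repeats the symbol (j % 2) exactly base**j times, so the two
    periodic patterns 000... and 111... are repeated more and more.
    """
    ones = 0
    total = 0
    freqs = []
    for j in range(n_runs):
        length = base ** j
        if j % 2 == 1:  # odd-numbered runs consist of the symbol 1
            ones += length
        total += length
        freqs.append(ones / total)
    return freqs


freqs = running_frequencies(8)
even_ends = freqs[0::2]  # frequencies measured just after a run of 0s
odd_ends = freqs[1::2]   # frequencies measured just after a run of 1s
```

With these choices the running frequency keeps swinging between values below 0.1 and values above 0.9, so it has no limit, whereas a sequence chosen at random would give a convergent frequency with probability one.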

5. - On the families of periodic motions.

G. Darwin, F. R. Moulton, E. Strömgren (29) and others have carried out very
extensive numerical calculations in order to follow the simplest analytic
families of symmetric periodic motions as $C$ and $\mu$ vary. Wintner (30)
has recently attempted to classify and organize the results so obtained by
means of general principles, in particular the principle of the "natural
termination" of periodic families, stated by Strömgren and proved by Wintner.
This would be a task of formidable complexity if one wished to follow these
motions for all values of $C$ and $\mu$.

In this connection I should like to make the following remarks. By a periodic
family we understand all the periodic motions, for different real values of
$C$ and for given $\mu$ ($0 < \mu < 1$), which can be obtained by analytic
extension, $C$ being regarded as a single-valued analytic function of an
arbitrary real parameter $r$. With this definition two periodic motions which
coincide and then disappear belong to the same closed family in the
neighborhood. A periodic motion is here considered as traversed a certain
number of times.

Now suppose that such a branch terminates for $C = C_0$. Suppose further that
in terms of some parameter $r$ one has $\lim C = C_0$ for $\lim r = r_0$, and
that at the same time the maximum dimensions $D$ of the curve of motion, or
else the period $\Theta$, do not remain bounded, or else again that the
distance $d$ of the curve of motion from $S$ or from $J$ or from one of the
five points of libration, for which $\Omega_x = \Omega_y = 0$, becomes
arbitrarily small.

(29) See for example his article: Forms of Periodic Motion in the Restricted
Problem and in the General Problem of Three Bodies, Publikationer og mindre
Meddelelser fra Københavns Observatorium, No. 39 (1922). Other more recent
articles by Strömgren and his pupils will be found there.

(30) See the most recent of his articles, where bibliographical references
are to be found: Grundlagen einer Genealogie der periodischen Bahnen im
restringierten Dreikörperproblem, Mathematische Zeitschrift, I and II,
vol. 34 (1931).

In this case it is almost evident that one must have

$$\lim_{r \to r_0} \Bigl(D + \Theta + \frac{1}{d}\Bigr) = +\infty.$$

Otherwise one would have, for $r$ near $r_0$,

$$|D|,\ |\Theta| < K, \qquad |d| > \frac{1}{K}.$$

The elementary existence theorems for solutions show us that this is
impossible.

Indeed, suppose in the first place that $C_0$ is not one of the five values
of $C$ for which the equation $2\Omega = C_0$ holds at a point of libration.
It follows that the dimensions of the motion are neither large nor small and
that the period is bounded. Hence there will exist at least one limiting
periodic motion having the same properties. But on employing a local surface
of section, one sees at once that the conditions of periodicity may be
written

$$\varphi(\xi, \eta, C) = 0, \qquad \psi(\xi, \eta, C) = 0,$$

where $\xi, \eta$ are coordinates on the surface of section and where one has
for $r = r_0$

$$\varphi(0, 0, C_0) = 0, \qquad \psi(0, 0, C_0) = 0.$$

For such equations, the solutions $\xi(C(r))$, $\eta(C(r))$ which we consider
in the neighborhood group themselves into real analytic families which never
terminate for $C = C_0$, according to the well-known theorems on implicit
functions.

On the other hand, if $C$ has one of the five excluded values and the point
$(x, y)$ comes into the neighborhood of a corresponding point of libration,
this point lies near a point of equilibrium in the space of the variables
$x, y, x', y'$. In this case the point will remain there for a long time, but
without remaining there forever, since $d$ is not small. Hence the period
would have to be very large, which contradicts our hypothesis.

Precisely the same reasoning shows also that if $D$ and $\Theta$ are bounded
and $d$ is not small, for $r$ near $r_0$, the branch considered does not
terminate for $C = C_0$ either.

Consequently, if a real periodic branch terminates, it terminates naturally,
in such a way that

$$D + \Theta + \frac{1}{d} \to +\infty.$$

It is in this almost intuitive sense that I interpret the "principle of
natural termination" of Strömgren and Wintner.



Let us observe that the possibility of isolated periodic motions or of
periodic families for $C = C_0$ is not excluded by this reasoning.

It is quite evident that the methods which we have employed suffice to give a
very good idea of what is truly essential in all the results of numerical
calculation, at least if a surface of section $S_2$ exists. For example,
suppose that a symmetric periodic motion passes through two cusps in
symmetric position on the curve of zero velocity, and thus acquires or loses
two double points in symmetric position. This signifies only the following
fact. The curve of zero velocity is represented by a closed curve of $S_3$.
Let us follow, in both senses of time, the trajectories passing through an
arbitrary point of this particular curve of $S_3$ up to their first meeting
with $S_2$. One thus obtains two corresponding curves $C'$ and $C''$ in
symmetric position, which cross twice on the axis. Indeed the two curves will
cross at the two points of the $\theta$-axis which correspond to the two
states of motion $x = x_1$, $y = 0$, $x' = y' = 0$ and $x = x_2$, $y = 0$,
$x' = y' = 0$, where $(x_1, 0)$ and $(x_2, 0)$ are the two points of the
$x$-axis on the curve of zero velocity. Hence there is nothing special at the
instant considered except the following fact: two symmetric images $P'$ and
$P''$ of the periodic point lie at that instant, the one on $C'$, the other
on $C''$, passing from one side to the other.

In a similar manner the "collision" of the point $P$ of zero mass with $S$ or
$J$ indicates only that one of the periodic points lies on the curve
$\rho = 0$ of $S_2$.

Such numerical researches have considerable mathematical interest. Indeed the
results obtained suggest possible theorems. For example, I have shown (see
IV, chapter I) that for the existence of any surface of section whatever the
following condition must hold: if one takes two surfaces not tangent to any
trajectory which are crossed by a certain trajectory at $P$ and $Q$, and if
this trajectory is varied continuously without $P$ or $Q$ passing beyond the
boundaries, the interval of time between the two points of crossing cannot
become infinite. In consequence the period of a symmetric periodic motion
cannot become infinite so long as a surface of section exists. Hence if the
calculations show us that the period of a symmetric periodic motion becomes
infinite for a certain value of $\mu$ and $C$, we must conclude that at that
moment no surface of section exists. In this particular case the two surfaces
(which may be identical) are defined by the equation $x' = 0$.

It would also be very interesting to calculate explicit series for the
functions $\rho_1$ and $\theta_1$ which define $T$. One would thus determine
approximately some of the images of the axes of symmetry, and the interlacing
of some asymptotic branches. New facts would thereby be obtained concerning
the distribution of the symmetric motions and concerning the signature $S$.

By means of such calculations, can one find an answer to other interesting
questions such as that of stability? I believe not, unless truly prodigious
calculations are made. Indeed, in the neighborhood of a formally stable
periodic motion such as $L_1$ or $L_2$, the asymptotic series, although
divergent in general, give us the only effective means of calculating the
neighboring motions over very long intervals of time. But it is the character
of these neighboring motions over such intervals which determines the
stability or the instability of the periodic motion. The difficulty of a
direct calculation by short successive intervals would be almost
inconceivable. Nevertheless the use of these series must be ruled out, since
it presupposes in advance that stability holds.





Reprinted from Science, December 28, Vol. 94, No. 2452, pages 598-600.


SOME UNSOLVED PROBLEMS OF THEORETICAL 

DYNAMICS 1 

By Dr. GEORGE D. BIRKHOFF 
PERKINS PROFESSOR OF MATHEMATICS AT HARVARD UNIVERSITY


As was first realized about fifty years ago by the
great French mathematician, Henri Poincaré, the
study of dynamical systems (such as the solar system)
leads directly to extraordinarily diverse and important
mathematical problems in point-set theory, topology
and the theory of functions of real variables.

On the other hand, the abstract point of view emphasized
by the foremost American mathematician of the
same period, E. H. Moore of the University of Chicago,
led him in the early years of the present century
to his "general analysis." Moore sought to introduce
an absolutely general independent variable, ranging
over an abstract space, whereas previously attention
had been limited to an independent variable ranging
over ordinary n-dimensional space. He hoped that
in this way the abstract essence of various current
theories in analysis might be more clearly revealed.
Ideas of a somewhat similar type had been proposed
a little earlier by Maurice Fréchet and also by Erhard
Schmidt. But only Moore saw the full significance
of general analysis for mathematical thought; and it
is only in recent years that his ideas are receiving the
attention which they deserve from mathematicians.

An early illustration of the wide scope of these
Moorean ideas was furnished by the "recurrent motions"
of dynamical systems first defined and studied
by the writer in 1910, shortly after the completion of
his graduate studies at Chicago. The possibility of
making an extension of this theory so as to define
"recurrent motions" and certain analogous "central
motions" in the sense of general analysis was announced
by him in his Chicago Colloquium Lectures
on Dynamical Systems in 1920.

The principal part of his paper was occupied with
this abstract phase of dynamics, which has been the
subject of much recent work by American mathematicians
and by the powerful contemporary Russian
mathematical group. The kind of abstract space, R,
which it seems best to employ is a compact, metric
space. Corresponding to the change in "time" t there
is a steady flow of the space R into itself, each point
tracing out a "curve of motion" in R. The individual
points represent "states of motion," and each curve
of motion represents a complete motion of the abstract
dynamical system. Thus there is provided not only
an abstract space R but a "continuous group"
G: t' = t + c. In other cases this group may be discrete:
t' = t + n (n an integer), or of still more complicated
form. For a continuous flow in such an
abstract space R, the recurrent motions are merely
those which trace out with uniform closeness in any
sufficiently large period of their entire history, all
their states; a periodic motion, represented by a closed
streamline, affords the simplest illustration of such
a recurrent motion. The analogous central motions
are those which recur infinitely often near to any particular
state of the motion, or at least have such
motions in the infinitesimal vicinity of any state.

1 Summary of a paper presented at a fiftieth anniversary
symposium of the University of Chicago, September
24, 1941.

The first ten of the sixteen problems presented and 
briefly discussed were of this abstract type. 

Problem 1 embodied a conjecture as to the interrelationship
between continuous and discrete flows in
such an abstract space R. It is easy to see that this
relationship must be an intimate one by recalling the
close connection between an ordinary changing visual
image of continuous type and the corresponding moving-picture
image of discrete type. In the abstract
space R a species of reduction of a continuous flow
to a discrete flow or at least one of "extensibly discrete"
type may be effected by a process of sectioning,
first employed by Whitney in a local manner. It was
conjectured that conversely any such extensibly discrete
flow may be imbedded in an ordinary continuous
flow. Ambrose and Kakutani have recently obtained
interesting results lying in the same general direction
as this first problem.

In problems 2 and 3 it was conjectured that all the 
motions of a continuous flow will be recurrent if and 
only if the flow may be decomposed into a set of 
irreducible constituent flows which are "homogeneous" 
(i.e., such that the stream lines are topologically
indistinguishable from one another). Thus the familiar
two-body problem for a sufficiently small value of the 
energy constant is of this type, the irreducible con¬ 
stituents being the individual periodic elliptic motions. 

The flows which arise from ordinary dynamical
problems are not only continuous but in general are
"conservative," i.e., leave a volume integral invariant,
as in the case of the flow of an incompressible fluid.
This property of conservativeness was used about
seventy-five years ago by Boltzmann and Maxwell in
the foundation of statistical mechanics. It is easy
and natural to extend the definition of conservative
flows to the abstract case. Important studies of abstract
conservative flows have been made recently by
Beboutoff, Bogoliuboff, Kryloff, Stepanoff in Russia
and by Halmos, Oxtoby, Ulam, von Neumann, Wiener
and Wintner in this country, among others. The field
of mathematics devoted to the study of conservative
flows has risen to the rank of an important branch
of mathematics, called "ergodic theory." This theory
is destined to play a fundamental role in statistical
mechanics, although as yet its importance for this
field has not been generally realized by physicists.

Problem 4 was concerned with such conservative
abstract flows. Here the interesting conjecture was
advanced that at least if the abstract flow is so regular
as to be "geodesic," then it will be conservative if all
the motions are central. The converse fact was essentially
established by Poincaré in the third volume of
his "Méthodes Nouvelles de la Mécanique Céleste."

The reasonableness of this conjecture was based
upon the use of a modified type of "compressibility
volume" of the kind introduced by E. Hopf, and an
analysis of recent remarkable results of Denjoy which
established the unexpected fact in a simple special
case that the ultimate behavior of a dynamical system
may depend on the degree of regularity of the functions
which characterize it.

In problem 5 it was likewise conjectured that the
recurrent motions are necessarily everywhere densely
distributed in the space R of a conservative flow.
Poincaré has made an analogous but stronger conjecture
in the case of the restricted problem of three
bodies and of certain analogous problems when R is
a three-dimensional space, namely, that the periodic
motions are everywhere dense in the totality of
motions, but it is known that his conjecture does not
always hold. Questions of this general type are of
philosophical interest, since the crude speculation that
all dynamical systems are periodic or nearly so presents
itself irresistibly to the human mind.

It was emphasized that from another point of view 
the real significance of the conservativeness of a flow 
is that (almost) all motions have habitual modes of 
behavior in the mean with respect to any measurable 
process. For example, consider the idealized friction¬ 
less motion of a billiard ball on a billiard table which 
has the shape of a convex oval. In any such motion 
the ball will be in the long run a definite proportion 
of the time on any assigned part of the table, will col¬ 
lide with the rim at a certain definite angular rate, etc. 
Problem 6 proposed a topological characterization of 
conservative flows based on this fact, similar to that 
given by Oxtoby and Ulam, in an as yet unpublished 
paper. 

In problem 7 the restriction of continuity upon a 
conservative flow was relaxed, and a characterization 
of the invariants of the flow based on certain "packing
coefficients" was proposed. A characterization of certain
special types of such flows in terms of their
"spectra" has been recently obtained by Halmos, von
Neumann, Wiener and Wintner.

Up to this point continuous steady flows in R and 
the more special "conservative" type had alone been
considered. But the continuity and conservativeness 
combined do not suffice to characterize the flow of 
true dynamical type except in the simplest case of two 
dimensions ($n = 2$). Hence it is of especial importance
to define abstractly a "dynamical" flow. This
was attempted by the writer. Roughly speaking, he 
takes Pfaffian systems as the model for his abstract 
definition rather than the more familiar but equivalent 
Hamiltonian systems of classical dynamics. In this 
way his task becomes that of formulating an abstract 
equivalent for the variational condition,
$$\delta \int \sum_i X_i \, dx_i = 0.$$

The crucial part of his characterization of a dynami¬ 
cal flow lay in the suitable definition of a line integral 
in any abstract “geodesic space" R. One conspicuous 
advantage of such a characterization of a dynamical 
flow is that the flow in any invariant subspace of R is 
seen at once to be of dynamical type also. 

It should be emphasized that hitherto the question 
of the adequate characterization of a dynamical flow 
beyond the obvious facts of continuity and conservativeness
has been especially baffling. The proposed
analytic characterization and the conjectured qualita¬ 
tive characterization embodied in problems 8, 9 and 
10 should prove suggestive in this connection. In 
problem 8 it was asserted that a dynamical flow is 
necessarily conservative; in problems 9 and 10 that, 
certain cases aside, not only are the periodic motions 
everywhere dense but the stable periodic motions are 
everywhere dense and dense on themselves. Here a 
stable periodic motion was defined purely topologically 
as any periodic motion in whose infinitesimal vicinity 
lie other complete motions. A partial converse is 
known to hold through results obtained by D. C. Lewis
and the lecturer. 

Problem 11 was of a nature intermediate between 
the case of an abstract space R and a space $R_n$ of $n$
dimensions, and was the only problem not stated in 
complete form. It called for the appropriate generali¬ 
zation to a gas of certain remarkable results for the 
famous three-body problem due to Sundman, and extended
by the writer and Hinrichsen to n > 3 bodies
and to a more general law of force than that of 
Newton. 

Problem 12 called for an example to show that in 
the case of a continuous (non-conservative) flow in a 
space $R_n$ of $n \ge 3$ dimensions, the ordinal series of
"wandering motions” leading to the central motions 
need not always terminate in n or fewer steps. 

In problem 13 it was conjectured that essentially
the only 3-dimensional discrete flows which are "regular"
in the sense of Kerékjártó are (1) combined rotations
of three circles into themselves; (2) combined
rotations of circle and surface of a 3-sphere into
themselves; (3) rigid rotation of a 3-dimensional
hypersphere into itself.

Problems 14 and 15 were closely related. The first 
of these asserts that a 1-1 direct analytic area-preserv¬ 
ing deformation of the surface of a sphere into itself 
which has two fixpoints, and is such that iterates of
the transformation have no other fixpoints, is a pure 
rotation from a topological point of view. Consider¬ 
able evidence was adduced for this conjecture. The 
second problem embodied an analogous conjecture 
concerning a plane circular ring. 

The last two of the announced problems (problems
16, 17) will perhaps excite the most interest, since
they embody conjectures which in a certain sense yield
a kind of complement to the famous "last geometric
theorem" of Poincaré, announced as probably true by
Poincaré shortly before his death and established subsequently
by the lecturer. Suppose that there be given
a ring-shaped part of the plane bounded by two concentric
circles. Suppose that this ring is deformed
into itself in any way so that the areas of small figures
are conserved, while the points on the two circles are
advanced by angular distances $\alpha$ and $\beta$. If $\alpha$ and $\beta$
are distinct, Poincaré's theorem leads at once to the
conclusion that there are infinitely many periodic sets
of points under the indefinite repetition of this deformation.
But if $\alpha$ and $\beta$ are equal, his theorem is not
applicable. The conjecture was made that the same
result (as well as other more specific ones) will hold
in the case $\alpha = \beta$, provided that some nearby points of
the ring become separated widely in an angular sense
by sufficient repetition of the deformation, as clearly
happens when $\alpha$ and $\beta$ are unequal. This conjecture
was proved in the very important special case when
the given conservative deformation can be expressed
as the product of two involutoric deformations.

In consequence, for the classic restricted problem of
three bodies treated by the American astronomer G.
W. Hill, so long as there exists a "surface of section,"
either there exist infinitely many periodic motions 
(for a given value of the “constant of Jacobi") or 
all possible motions of the "infinitesimal body” (the 
Moon in the Earth, Moon, Sun case) will necessarily 
have the same mean rate of synodical advance of 
perigee about the nearby finite body (the Earth),
per synodical revolution. It was also pointed out how 
the absence of infinitely many periodic orbits would 
indicate that a new qualitative integral exists, in addi¬ 
tion to the usual analytic integral of Jacobi. 

The problems presented and discussed by the writer 
will be likely to receive attentive consideration from 
other mathematicians inasmuch as they embody chal¬ 
lenging conjectures concerning important open ques¬ 
tions in the actively advancing field of theoretical 
dynamics.





Reprinted from Amer. Math. Mo., Vol. 49, 1942, pp. 222-226. 


WHAT IS THE ERGODIC THEOREM? 

G. D. BIRKHOFF, Harvard University 

The integral of Lebesgue (1901), founded upon Borel measure, has been a 
dominating weapon in the striking advance of Analysis during the present 
century. Perhaps the Ergodic Theorem (1931) is destined to hold a central posi¬ 
tion in this development. Indeed, Wiener and Wintner in a recent article* refer 
to it as “the only result of real generality established for the solutions of dy¬ 
namical systems.” 

To understand the theorem and the nature of its applications it is necessary 
first of all to say something about (Borel-Lebesgue) measure, i.e., "probability"
in the sense sketched by Poincaré in the third volume of his Méthodes Nouvelles
de la Mécanique Céleste. We restrict ourselves to the case of a line segment of
unit length with coordinate $x$, $0 \le x \le 1$. Suppose that we have a set of non-overlapping
intervals, finite in number and of total length $l < 1$ in this segment.
The probability in a certain intuitive sense that a point, taken at random, lies
in one of these intervals, is $l$; and the probability that it lies in the complementary
set is of course $1 - l$.

Now suppose that we are given a point set $M$ containing an infinite number
of points, which can be enclosed within an infinite set of non-overlapping intervals
of lengths $l_1, l_2, \ldots$ of total length

$$l_1 + l_2 + l_3 + \cdots = l < 1.$$

Then clearly the probability that a point, taken at random, lies in $M$, cannot
exceed $l$; and the probability that it lies in the complementary set is at least
$1 - l$. If now $M$ is of such a nature that it can be enclosed in an infinite set of
intervals of total length not exceeding an arbitrarily small quantity $\epsilon$, it is apparent
that the probability of a random point falling in $M$ does not exceed
$\epsilon$, i.e. the probability is 0. Such a set $M$ is said to be of measure 0.

For instance, the set of rational points x = m/n which is everywhere dense 
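on the line segment, is of measure 0. In fact these points may be arranged in
order

$$0,\ 1,\ \tfrac{1}{2},\ \tfrac{1}{3},\ \tfrac{2}{3},\ \ldots$$

and the nth one of these points may obviously be enclosed within an interval of
length $\epsilon/2^n$. Since we have

$$\frac{\epsilon}{2} + \frac{\epsilon}{2^2} + \frac{\epsilon}{2^3} + \cdots = \epsilon,$$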



* On the ergodic dynamics of almost periodic systems, American Journal of Mathematics, vol.
63, 1941. For an introduction to the literature see Eberhard Hopf's "Ergodentheorie," Ergebnisse der
Mathematik und ihrer Grenzgebiete. Berlin, Springer, 1937. Our discussion here deals only with
the "Ergodic Theorem," and not at all with the "Mean Ergodic Theorem" of von Neumann, which
stimulated me to reconsider some old ideas, and so led me to the discovery and proof of the Ergodic
Theorem, embodying a strong, precise result which, so far as I know, had never been hoped for.




it is evident that this set of rational points is of measure 0. 
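
The covering argument above can be checked with exact arithmetic. The following is only a minimal sketch: the function names, the choice $\epsilon = 1/100$, and the cut-off at 50 points are illustrative assumptions, not part of the original text. It lists the rational points in the order used above and encloses the nth point in an interval of length $\epsilon/2^n$.

```python
from fractions import Fraction

def rationals_in_unit_interval(count):
    """First `count` rationals of [0, 1] in the order 0, 1, 1/2, 1/3, 2/3, ..."""
    points = [Fraction(0), Fraction(1)]
    q = 2
    while len(points) < count:
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:  # keep only fractions already in lowest terms
                points.append(r)
        q += 1
    return points[:count]

def cover(points, eps):
    """Enclose the n-th point (n = 1, 2, ...) in an interval of length eps / 2**n."""
    intervals = []
    for n, x in enumerate(points, start=1):
        half = eps / 2 ** (n + 1)
        intervals.append((x - half, x + half))
    return intervals

eps = Fraction(1, 100)          # illustrative choice of the small quantity
points = rationals_in_unit_interval(50)
intervals = cover(points, eps)
total_length = sum(b - a for a, b in intervals)
print(total_length < eps)
```

However many points are covered, the total length stays below $\epsilon$, since the partial sums of $\epsilon/2 + \epsilon/4 + \cdots$ never reach $\epsilon$.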

More generally, if we have a set $M$ such that it can be enclosed within a set
of intervals of lengths $l_1, l_2, \ldots$ with

$$l_1 + l_2 + \cdots \le l + \epsilon$$

while the complementary set $\bar{M}$ can be enclosed similarly within intervals
$\bar{l}_1, \bar{l}_2, \ldots$ with

$$\bar{l}_1 + \bar{l}_2 + \cdots \le (1 - l) + \epsilon$$

for $\epsilon > 0$ arbitrarily small, then $M$ is said to be measurable of measure $l$; and
its complementary set $\bar{M}$ will then clearly be measurable of measure $1 - l$. In
this case the probability that a random point falls in $M$ is obviously to be regarded
as $l$.

All ordinary infinite sets specifically defined by analytic methods are found 
to be measurable in this sense.

The gist of the Ergodic Theorem can now be illustrated by means of our line 
segment. 

Suppose that there is given any one-to-one measure-preserving transformation
$T$ of the line segment into itself; $T$ may have a finite or infinite
number of discontinuities. A first simple example is the following: Imagine the
line segment $0 \le x < 1$ bent into a circle of circumference 1, without any stretching;
the first transformation $T$ is merely a rotation of this circle through a certain
angle $\alpha$. A second example is the following: The line segment is divided
into the infinite set of intervals,

and then the second interval is interchanged with the first, the fourth with the 
third, etc., thus defining the transformation T. In both cases T is evidently of 
the stated type, and measure is preserved. 

The Ergodic Theorem then says: For any such measure-preserving transformation
$T$, and for each individual point $P$ (except possibly an exceptional set of
measure 0), there is a definite probability that its iterates under $T$, from $P$ on,
namely

$$P,\ T(P),\ T^2(P),\ \ldots \quad \text{and} \quad P,\ T^{-1}(P),\ T^{-2}(P),\ \ldots$$

fall in a given measurable set $M$.

In other words the proportion of n of these points (beginning with P) which 
lie in the set M tends toward a definite limit as n approaches infinity in 
either direction. 
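
For the first example, the rotation of the circle, the statement can be tried numerically. This is a rough sketch under stated assumptions: the rotation amount (the golden-ratio fraction of the circumference), the starting point, and the set $M = [0.2, 0.5)$ are all illustrative choices, and `visit_frequency` is a hypothetical helper name.

```python
import math

def visit_frequency(x0, alpha, a, b, n, direction=+1):
    """Proportion of the n points x0, T(x0), ..., T^(n-1)(x0) lying in [a, b),
    where T is the rotation x -> x + alpha (mod 1); direction=-1 iterates the inverse."""
    hits = 0
    for k in range(n):
        x = (x0 + direction * k * alpha) % 1.0
        if a <= x < b:
            hits += 1
    return hits / n

alpha = (math.sqrt(5) - 1) / 2   # an irrational rotation, chosen for illustration
forward = visit_frequency(0.1, alpha, 0.2, 0.5, 100_000, +1)
backward = visit_frequency(0.1, alpha, 0.2, 0.5, 100_000, -1)
print(forward, backward)
```

Both proportions should settle near 0.3, the length of $M$, which for this particular rotation is also the limit predicted for every starting point.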

More generally, a line segment may be replaced by a finite volume $M$ of
$n$ dimensions, $n > 1$, and the points of $M$ may be assigned a variable (integrable)
positive weight, $w(P)$. The generalized theorem would then assert that the corresponding
weighted means tend toward a limit $\bar{w}(P)$. In the simple special case
first stated, this weight is 1 for the points of $M$ and 0 for the points not in $M$.



Or, again, for $n > 1$ the discrete transformation $T$ may be replaced by a steady
measure-preserving flow T t in time t, and the analogous theorem holds. 

To illustrate this last possibility, suppose that in the square $0 \le x < 1$,
$0 \le y < 1$, the points move with a uniform velocity in a fixed direction, making
an angle $\alpha$ with that of the $x$ axis, and leaving the square to return at the
homologous point (see the adjoining figure). Evidently such a transformation
$T_t$ is area-preserving. Let now $M$ be any selected measurable part of the square,
and let $P$ be any point of the square, aside always from a possible exceptional
set of measure 0. On the basis of the same theorem, there is a definite probability
in infinite time, $t \ge 0$ or $t \le 0$, that $P_t = T_t(P)$ falls within $M$, and this probability
is the same in both directions. More generally a weight $w(P)$ may be introduced
in the case of a "flow" as well as in the discrete case.

In more analytic garb, the theorem states in the two cases respectively that
for $n \to \pm\infty$, $T \to \pm\infty$:

$$\frac{w(P) + w(T(P)) + \cdots + w(T^{n-1}(P))}{n} \to \bar{w}(P), \qquad \frac{1}{T} \int_0^T w(T_t(P))\, dt \to \bar{w}(P).$$
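
Both kinds of mean can be approximated for a very simple system. The sketch below is an illustration only: it assumes a rotation of the unit circle for the discrete case, a uniform flow on the same circle for the continuous case, and the weight $w(x) = x^2$; none of these choices comes from the text, and the time integral is replaced by a midpoint rule.

```python
import math

def discrete_mean(w, x0, alpha, n, direction=+1):
    """Mean of w over x0, T(x0), ..., T^(n-1)(x0) for the rotation T(x) = x + alpha (mod 1).
    direction=-1 uses the inverse rotation instead."""
    return sum(w((x0 + direction * k * alpha) % 1.0) for k in range(n)) / n

def flow_mean(w, x0, speed, T, steps=200_000):
    """Midpoint-rule estimate of (1/T) * integral_0^T w(T_t(x0)) dt for the flow
    T_t(x) = x + speed * t (mod 1); T may be negative."""
    dt = T / steps
    return sum(w((x0 + speed * (k + 0.5) * dt) % 1.0) for k in range(steps)) / steps

def w(x):
    return x * x  # illustrative weight; its average over [0, 1) is 1/3

alpha = (math.sqrt(5) - 1) / 2
means = [
    discrete_mean(w, 0.1, alpha, 100_000, +1),
    discrete_mean(w, 0.1, alpha, 100_000, -1),
    flow_mean(w, 0.1, math.sqrt(2), 500.0),
    flow_mean(w, 0.1, math.sqrt(2), -500.0),
]
print(means)
```

All four numbers should lie close to 1/3, in both directions of the index and of the time.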

The kind of applications to dynamical systems which the Ergodic Theorem 
affords are exceedingly varied and interesting. Take the simple example of an 
idealized convex billiard table on which an idealized billiard ball P moves with 
velocity 1. In the figure let $\phi$ = arc $OA$, $\phi_1$ = arc $OA_1$, $l = AP$, $l_0 = AA_1$. We have
a transformation $(\theta_1, \phi_1) = T(\theta, \phi)$ defined over a rectangle

$$0 < \theta < \pi, \quad 0 \le \phi < p \quad (p = \text{perimeter of table})$$

in the $\theta\phi$-plane, associated with the motion. It is not hard to prove that $T$ is
measure-preserving in the sense that the double integral

$$\iint \sin\theta \, d\theta \, d\phi$$


has the same value when extended over any measurable part of this rectangle 




as over its image under T; indeed it would be possible to deform the rectangle 
so that, over the new region, ordinary areas are preserved. 

Furthermore it is clear that, if we associate with any "state of motion" of
the billiard ball, as of $P$, the three coordinates $\theta$, $\phi$, $l$, then a steady flow $T_t$ is
defined in the corresponding region of three-dimensional $\theta\phi l$-space:

$$0 < \theta < \pi, \quad 0 \le \phi < p, \quad 0 \le l \le l_0$$

in which the following volume integral is preserved:

$$\iiint \sin\theta \, d\theta \, d\phi \, dl.$$

Thus the theorem applies to this flow. 

Here are three obvious applications to this simple but typical dynamical
problem:

(1) the average length of n successive chords of the path tends to a definite
limit, the same whether the time t increases or decreases;

(2) the average angle $\theta$ at n successive collisions tends to a definite limiting
value;

(3) the billiard ball tends in the limit to lie in any assigned area of the table 
a definite proportion of the time. 
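
Application (1) can be explored numerically on an elliptical table. The following is a minimal sketch, not a definitive implementation: the semi-axes, the starting point, and the initial direction are arbitrary illustrative values, and all function names are hypothetical. It follows the ball from collision to collision, forward in time and along the time-reversed motion, and averages the chord lengths.

```python
import math

A, B = 2.0, 1.0  # semi-axes of an illustrative elliptical table

def reflect(v, q):
    """Reflect direction v in the tangent line of the ellipse at boundary point q."""
    nx, ny = q[0] / A**2, q[1] / B**2          # (unnormalized) outward normal at q
    k = 2.0 * (v[0] * nx + v[1] * ny) / (nx * nx + ny * ny)
    return (v[0] - k * nx, v[1] - k * ny)

def next_collision(p, d):
    """From boundary point p with unit direction d, return (q, d_after, chord_length)."""
    wpd = p[0] * d[0] / A**2 + p[1] * d[1] / B**2
    wdd = d[0] ** 2 / A**2 + d[1] ** 2 / B**2
    t = -2.0 * wpd / wdd                        # nonzero root of the chord equation
    q = (p[0] + t * d[0], p[1] + t * d[1])
    return q, reflect(d, q), t

def mean_chord(p, d, n):
    """Average length of n successive chords starting from p in direction d."""
    total = 0.0
    for _ in range(n):
        p, d, t = next_collision(p, d)
        total += t
    return total / n

s, tilt = 0.7, 0.9                              # illustrative starting data
p0 = (A * math.cos(s), B * math.sin(s))
nx, ny = -p0[0] / A**2, -p0[1] / B**2           # inward normal at p0
norm = math.hypot(nx, ny)
nx, ny = nx / norm, ny / norm
d0 = (math.cos(tilt) * nx - math.sin(tilt) * ny,
      math.sin(tilt) * nx + math.cos(tilt) * ny)
back0 = reflect((-d0[0], -d0[1]), p0)           # initial direction of the time-reversed motion

forward = mean_chord(p0, d0, 50_000)
backward = mean_chord(p0, back0, 50_000)
print(forward, backward)
```

The two averages are expected to be close to one another, in line with the assertion that the limit is the same whether the time increases or decreases.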

There is one especially interesting case, which may in fact be the “general 
case” as far as we know: It may happen that all of the points of our volume be¬ 
have in essentially the same way in the mean (aside always from the excepted 
set of measure 0, of course). If they do not so behave, the underlying space can 


be subdivided into invariant measurable sets; thus for an elliptical table, the 
motions lying wholly in the ring outside a smaller confocal ellipse form such a 
closed invariant set; and this is an integrable problem—a limiting case of 
geodesics on a flattening ellipsoid. 

What the Ergodic Theorem means, roughly speaking, is that for a discrete 
measure-preserving transformation or a measure-preserving flow of a finite 
volume, probabilities and weighted means tend toward limits when we start 
from a definite state P (not belonging to a possible exceptional set of measure 
0), and, furthermore, the limiting value is the same in both directions. 

The Ergodic Theorem applies to manifold deep problems of analysis and 
of applied mathematics—as well to the solar system as to our simple billiard 
ball problem! Thus in G. W. Hill’s celebrated idealization of the earth-sun- 
moon problem (the restricted problem of three bodies) we can at once assert 
(with probability 1) that the moon possesses a true mean angular rate of rotation
about the earth (measured from the epoch), the same in both directions of
the time. 





Reprinted from Publicaciones del Instituto de Matemática, Vol. 6,
Rosario, 1945, pp. 1-14.


CERTAIN TRANSFORMATIONS IN DYNAMICS
WITHOUT PERIODIC ELEMENTS


by

George D. Birkhoff and Jaime Lifshitz*


INTRODUCTION

In the study of the set of motions of a dynamical system with
2 degrees of freedom, one of the methods used has been to reduce
the problem to the study of the transformation of a two-dimensional
surface into itself, a method discovered and applied by Poincaré
and later used and perfected with success by Levi-Civita and
G. D. Birkhoff (**).

By virtue of the existence theorems, for a set of two position
coordinates and the two respective velocities given at one instant,
there exists a unique motion of the dynamical system. Considering
the phase space formed by the four variables mentioned above, to
every point of it there corresponds a uniquely defined orbit, and to
every orbit there corresponds a line in this four-dimensional space.
The set of lines corresponding to all the possible motions for a
given constant energy fills a three-dimensional subspace of the
phase space. If there exists a special analytic surface which is cut
by the lines of motion in the same sense, the so-called "analytic
surface of section," the qualitative study of the set of motions can
be reduced to the study of a certain transformation $T$ of this
surface into itself.

(*) Guggenheim Fellow, Harvard University.

(**) For a recent and detailed study see Nouvelles recherches sur
les systèmes dynamiques, Mem. Pont. Acad. Scient. Novi Lyncaei,
Ser. III, Vol. I, by G. D. Birkhoff.
The transformation $T$ will be $P' = T(P)$, where $P$, $P'$ are points
of the surface of section which lie on the same line of motion and
$P'$ is the first point of this kind following $P$ in the forward sense
of the time. This transformation is defined for all the points of the
surface of section, is one-to-one and bicontinuous, and in the
ordinary cases is even bianalytic. Moreover, in these cases there
exists a surface integral $G(C) = \int_C F\, dS$, $M > F > m > 0$, which
is not changed by the transformation $T$, a characteristic property
which in a certain sense may be considered equivalent to the
invariance of areas.

In order that the motion corresponding to an interior point $P$ of
the surface of section be periodic, it is evidently necessary and
sufficient that the transformation $T$, applied to the point $P$ a
finite number of times, give the same point, that is, that
$T^n(P) = P$ for a certain $n \neq 0$. In the general case there will be
periodic points, a result of great importance in analytical dynamics.
Another very interesting problem, the study of the motions in the
vicinity of a periodic motion, was also solved in very complete form.

In the particular case of the restricted problem of three bodies,
Poincaré has shown that there is a surface of section topologically
equivalent to the plane annular region. Later G. D. Birkhoff showed
that this transformation $T$ is the product of two one-to-one
transformations $R$ and $U$, continuous and involutory of the second
order, which transform the boundary circles into themselves, each
of which reverses the cyclic order of the points on these circles.

In the present paper we shall try to find some properties of the
transformations $T = RU$, under the hypothesis that there are no
periodic points. The question "Which are the dynamical systems
that have a finite number of periodic motions?" is very important
for theoretical dynamics and is closely linked with such
transformations $T = RU$.


1. Involutory transformations of the 2nd order.


I. If a transformation $T$ of a circle into itself is one-to-one,
continuous and reverses the cyclic order of the points, then there
exist two distinct invariant points $P_1$, $P_2$ on the circle, which
divide it into two open arcs $A_1$, $A_2$ that are interchanged by the
transformation $T$.

If $\vartheta$ and $\vartheta'$ are the polar coordinates of the points $P$ and
$P' = T(P)$, the function $\vartheta' = f(\vartheta)$ is continuous and monotone
decreasing. The function $\varphi(\vartheta) = f(\vartheta) - \vartheta$ is also monotone
decreasing, and in the interval $0 \le \vartheta < 2\pi$ it decreases by $4\pi$. Then the
curve $\varphi = f(\vartheta) - \vartheta$ cuts exactly two successive lines of the
family $\varphi = 2k\pi$ ($k$ an integer) in the interval $0 \le \vartheta < 2\pi$. For the
two corresponding points $P_1$, $P_2$ we shall have:

$$f(\vartheta_1) - \vartheta_1 = 2k\pi \;\rightarrow\; \vartheta'_1 = f(\vartheta_1) = \vartheta_1 + 2k\pi,$$

$$f(\vartheta_2) - \vartheta_2 = 2(k+1)\pi \;\rightarrow\; \vartheta'_2 = f(\vartheta_2) = \vartheta_2 + 2(k+1)\pi,$$

and therefore the point with coordinate $\vartheta'_1 = f(\vartheta_1) = \vartheta_1 + 2k\pi$
coincides with the point with coordinate $\vartheta_1$, that is, it is invariant. The
same may be said of the other point. The arcs $A_1$, $A_2$ into which
the circle is divided are interchanged by the transformation
$T$, because if they were not interchanged there would be a point $Q$
such that $Q$ and $T(Q)$ would lie in the same arc. But in this
case the cyclic order of $P_1 Q P_2$ would not be reversed by the
transformation, which is impossible, so that it is proved that $T(A_1) = A_2$
and $T(A_2) = A_1$. It is evident that there cannot be a third
invariant point.
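
Theorem I can be illustrated on a concrete map. The sketch below assumes one particular lift, $f(\vartheta) = 1 - \vartheta + 0.3 \sin\vartheta$, chosen only because it is continuous, strictly decreasing, and drops by $2\pi$ over a full turn; the function names are hypothetical. It locates the invariant points by bisection on $\varphi(\vartheta) = f(\vartheta) - \vartheta$ and then checks that a point of one arc is carried into the other.

```python
import math

TWO_PI = 2.0 * math.pi

def f(theta):
    """Lift of an orientation-reversing circle map (an illustrative choice)."""
    return 1.0 - theta + 0.3 * math.sin(theta)

def phi(theta):
    return f(theta) - theta

def invariant_points():
    """Solve phi(theta) = 2*k*pi for theta in [0, 2*pi) by bisection, for every admissible k."""
    roots = []
    k_hi = math.floor(phi(0.0) / TWO_PI)
    k_lo = math.floor(phi(TWO_PI) / TWO_PI)
    for k in range(k_lo, k_hi + 1):
        target = TWO_PI * k
        if not (phi(TWO_PI) < target <= phi(0.0)):
            continue                      # this line is not crossed on [0, 2*pi)
        lo, hi = 0.0, TWO_PI
        for _ in range(60):               # phi is decreasing, so bisection applies
            mid = 0.5 * (lo + hi)
            if phi(mid) >= target:
                lo = mid
            else:
                hi = mid
        roots.append(0.5 * (lo + hi))
    return sorted(roots)

roots = invariant_points()
q = 2.0                                   # a point of the open arc between the two roots
image = f(q) % TWO_PI
print(roots, image)
```

For this map the search returns exactly two invariant points, and the image of `q` falls outside the arc containing `q`, as the theorem requires.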

II. If $T$ is a one-to-one, continuous and involutory transformation
of a disk $C$ into itself, which reverses the cyclic order of the points
on the circumference, the set $S$ of the invariant points is a continuum.

The set $S$ is closed because the transformation $T$ is
continuous. To prove that it is connected, let us suppose the contrary,
that $S = \bar{S} = S_1 + S_2$, and $S_1 \times \bar{S}_2 = 0 = \bar{S}_1 \times S_2$; then
$\bar{S}_1 = \bar{S}_1 \times S = \bar{S}_1 \times S_1 + \bar{S}_1 \times S_2 = \bar{S}_1 \times S_1 = S_1$,
and in the same way $\bar{S}_2 = S_2$.
$S_1$ and $S_2$, being closed sets without a common point, can be
separated in the plane by means of a simple closed curve.

If this curve lies in the interior of the disk, if $C$ is the
set of points interior to this curve, and if $Q$ is in
$S \times C$, the maximal connected subset of $C \times T(C)$ which contains
$Q$ is a region topologically equivalent to a disk. The
transformation $T$ is an involution of this region into itself
which reverses the cyclic order of the points on the boundary.
Then there must be two invariant points on the boundary;
but on the other hand this boundary is contained in the curve which
separates the invariant points without containing any, and in its transform,
which evidently contains none either.

If the closed curve which separates $S_1$ and $S_2$ cuts the
circumference, there will be an arc of it interior to the disk which
will divide the disk into two parts, each of which will contain
invariant points. Applying the same reasoning as
before, one obtains that there must exist a curve without invariant points
which divides the disk into regions for which $T$ is an involution,
and by the preceding theorem on the two boundaries of these
regions there must be two invariant points, which is impossible.
Since there cannot be a separation by means of a curve which
cuts the circumference, nor by means of a curve which does not cut it,
the set $S$ is connected and therefore is a continuum.

If we take an arbitrary point $Q$ belonging to $S$ and an arbitrary number
$\epsilon$ ($\epsilon > 0$), the maximal connected subset $U \subset V_\delta(Q) \times T[V_\delta(Q)]$
(*), ($\epsilon > \delta > 0$), which contains $Q$ is a region topologically
equivalent to a disk, and the transformation $T$ satisfies
all the conditions of the preceding theorem. Then the
set $S'$ of the invariant points of the set $U$ is connected
in $U$. Evidently we have $S' = S \times U$ and $U \subset V_\epsilon(Q)$, and since
$\epsilon$ is arbitrary, $S$ is locally connected at $Q$. $Q$ being an arbitrary
point of $S$, $S$ is a locally connected continuum and therefore
the points $P_1$, $P_2$ of $S$ can be joined by means of a
simple arc $J$, $J \subset S$. $J$ divides the disk into two parts $C_1$, $C_2$
such that every interior point of $C_1$ can be joined to $A_1$
by means of a simple arc interior to the disk which does not cut $J$, and
likewise $C_2$ with respect to $A_2$. If $P$ belongs to $C_1$, then $T(P)$
belongs to $C_2$, which is easy to verify by joining $P$ to a
point of $A_1$ by means of an arc which does not cut $J$; the transform
of this arc will not cut $J$ either and, since it contains a point of
$A_2$, it has to lie wholly in $C_2$. Hence also there
cannot be invariant points outside $J$, that is, $J = S$.


(*) We use the symbol $V_\epsilon(Q)$ to designate the set of points interior
to the disk whose distance from the point $Q$ is less than $\epsilon$.



III. A transformation $T$ which satisfies the conditions of
Theorem II leaves invariant the points of a simple arc $J$ which
joins the invariant points of the boundary and interchanges the
regions $C_1$ and $C_2$ into which this arc divides the disk, or, what is the
same thing, it is topologically equivalent to a reflection (*).

III'. A one-to-one, continuous and involutory transformation
of an annulus into itself, which transforms the boundary circles
into themselves and reverses the cyclic order of the elements
on both, is topologically equivalent to a reflection.

To prove it, it suffices to define a transformation $T'$ which
transforms the concentric circles of the inner disk into
themselves in a manner similar to the inner boundary circle, and
is equal to $T$ on the annulus. Since $T'$ satisfies the conditions
of the preceding theorem, the set of invariant points in the
annulus forms two simple arcs which do not intersect, each of
which joins an invariant point of the inner circle
with one of the outer. Therefore they divide the annulus into two
parts which are interchanged by the transformation $T$.

2. General properties of a transformation $T = RU$.

As we have said before, in the restricted problem of
three bodies the transformation of the surface of
section is equal to the product of two transformations of the type
studied above and, moreover, has an invariant area integral.

We shall represent such an initial transformation by
$T = RU$, $R^2 = I = U^2$, where $I$ is the identity transformation,
$T$, $R$, $U$ are one-to-one and continuous transformations of the
annular region, and $R$ and $U$ are involutions of the second order which
transform the boundary circles into themselves and
reverse the cyclic order of the points on them. We write
the area integral $G(C) = \int_C F\, dS$, ($M > F > m > 0$), such that
$G[T(C)] = G(C)$.

The sets of invariant points under $R$ and $U$ will be denoted
by $A$ and $B$ respectively, that is,


(*) A more general theorem was proved by S. Eilenberg in Sur les
transformations périodiques de la surface de sphère, Fundamenta Mathematicae,
vol. 22 (1934), pp. 28-41.




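$$P \in A \;\leftrightarrow\; R(P) = P, \qquad A = \alpha + \alpha' \quad (\alpha, \alpha' \text{ simple arcs}),$$

$$P \in B \;\leftrightarrow\; U(P) = P, \qquad B = \beta + \beta' \quad (\beta, \beta' \text{ simple arcs}).$$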

Writing $T^n R = R_n$ and $T^n U = U_n$, we have:

$$T^{-1} = (RU)^{-1} = U^{-1} R^{-1} = UR,$$

$$R_1 = TR = (RU)R = R(UR) = RT^{-1},$$

$$U = UR^2 = (UR)R = T^{-1}R = R^2 U = R(RU) = RT,$$

and $R_n = T^n R = T^{n+1} T^{-1} R = T^{n+1} U = U_{n+1}$.

It follows, by induction, that

$$R_n = T^n R = R T^{-n}, \qquad (n = 0, \pm 1, \pm 2, \ldots)$$

$$U_n = R_{n-1} = R T^{-n+1} = R T\, T^{-n} = U T^{-n}, \qquad (n = 0, \pm 1, \pm 2, \ldots) \tag{II}$$

Therefore,

$$R_n^2 = T^n R\, T^n R = T^n T^{-n} R R = I, \qquad U_n^2 = I,$$

$$R_n U_n = T^n R\, T^n U = T^n T^{-n} R U = RU = T. \tag{III}$$

That is, the pair of transformations $(R_n, U_n)$ has the
same properties as the initial pair $(R, U)$ $(= (R_0, U_0))$ and, for
the study of the transformation $T$, may replace it. Moreover,
"$Q$ belongs to $T^n(A)$," that is, "$T^{-n}(Q)$ belongs to $A$," is
equivalent to

$$R[T^{-n}(Q)] = T^{-n}(Q) \;\leftrightarrow\; T^n R(Q) = T^{-n}(Q) \;\leftrightarrow\; T^{2n} R(Q) = Q,$$

that is, "$Q$ belongs to $T^n(A)$" is equivalent to

$$Q = R_{2n}(Q) = U_{2n+1}(Q), \tag{P}$$

and in the same way to say that $Q$ belongs to $T^n(B)$ is equivalent to

$$Q = R_{2n-1}(Q) = U_{2n}(Q),$$

which gives the relations between the invariant curves for
different factorizations of $T$ (*).

(*) These properties can be seen by noting that $R_i$ and $U_i$ can always
be presented as $T^{-n} R T^{n}$ or as $T^{-n} U T^{n}$.
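
The algebraic relations of this section hold in any group generated by two involutions, so they can be checked in a toy model. The sketch below is only an illustration under that assumption: it takes $R$ and $U$ to be the reflections $x \mapsto -x$ and $x \mapsto -x + c$ of the real line (with $c = 3/7$ an arbitrary choice), stores each map $x \mapsto sx + c$ as a pair `(s, c)`, and reads $RU$ as "apply $U$ first, then $R$." It is not the annulus transformation of the paper.

```python
from fractions import Fraction

# A map x -> s*x + c on the real line, stored as the pair (s, c) with s = +1 or -1.
IDENTITY = (1, Fraction(0))

def compose(f, g):
    """Return the map x -> f(g(x))."""
    return (f[0] * g[0], f[0] * g[1] + f[1])

def inverse(f):
    s, c = f
    return (s, -s * c)

def power(f, n):
    """n-th iterate of f; negative n iterates the inverse."""
    base = f if n >= 0 else inverse(f)
    result = IDENTITY
    for _ in range(abs(n)):
        result = compose(base, result)
    return result

R = (-1, Fraction(0))          # reflection x -> -x, fixed point 0
U = (-1, Fraction(3, 7))       # reflection x -> -x + 3/7 (illustrative)
T = compose(R, U)              # T = RU

def R_n(n):
    return compose(power(T, n), R)

def U_n(n):
    return compose(power(T, n), U)

checks = []
for n in range(-5, 6):
    checks.append(R_n(n) == compose(R, power(T, -n)))      # R_n = T^n R = R T^(-n)
    checks.append(U_n(n) == R_n(n - 1))                    # U_n = R_(n-1)
    checks.append(compose(R_n(n), R_n(n)) == IDENTITY)     # R_n^2 = I
    checks.append(compose(U_n(n), U_n(n)) == IDENTITY)     # U_n^2 = I
    checks.append(compose(R_n(n), U_n(n)) == T)            # R_n U_n = T
    fixed_of_R_2n = R_n(2 * n)[1] / 2                      # solves x = -x + c
    checks.append(fixed_of_R_2n == power(T, n)[1])         # equals T^n applied to 0
print(all(checks))
```

The last check mirrors the statement that $Q$ belongs to $T^n(A)$ exactly when $Q = R_{2n}(Q)$: in the toy model the only fixed point of $R$ is 0, and the fixed point of $R_{2n}$ is its image under $T^n$.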




3. The transformations $T = RU$ without periodic points.

IV. In case the transformations $T$, $R$, $U$ are of the
type studied above, the following properties are equivalent:

(I) There are no periodic points (points such that $T^n(P) = P$, $n \neq 0$);

(II) The sets $\{A_n; B_n\}$ do not intersect ($A_n = T^n(A)$, $B_n = T^n(B)$);

(III) $\lim \dfrac{\vartheta_n - \vartheta}{n} = \sigma$ exists, is the same for all
the points of the annulus, and is an irrational number, $\vartheta$ and $\vartheta_n$ being
the respective polar coordinates of the points $P$ and $T^n(P)$.

Proof of I → II.

If $P$ belongs to $A_n$ and $A_m$ ($n \neq m$), then:

$$T^{2n-2m}(P) = T^{2n-2m} R^2(P) = T^{2n} R\, T^{2m} R(P) = R_{2n}[R_{2m}(P)] = R_{2n}(P) = P;$$

in the same way, if $P$ belongs to $B_n$ and $B_m$ ($n \neq m$), $T^{2n-2m}(P) = P$,
and if it belongs to $A_n$ and $B_m$,

$$T^{2n-2m+1}(P) = T^{2n-2m+1} R^2(P) = T^{2n} R\, T^{2m-1} R(P) = R_{2n}[R_{2m-1}(P)] = R_{2n}(P) = P,$$

whence (I) → (II).

Proof of II → III.

If we represent the annular region defined above on an
infinite strip in such a way that the Cartesian coordinates
correspond to the polar coordinates $\vartheta$ and $r$ respectively, to
each curve on the annulus there will correspond a family of curves
displaced by $2\pi$, one with respect to the preceding. If $F' = rF$,
we have the relation $G'(C') = G(C)$, where $C'$ is the region
corresponding to $C$ (taken only once), and $G'(C') = \int_{C'} F'\, dS$.
Without loss of generality it may be supposed that $G(C)$ over the whole
annulus is equal to $2\pi$. The value of $G'(C)$ over the area between $\alpha_n$ and
$\alpha_{n+1}$ is independent of $n$, and will be designated by the symbol $\sigma$. It is
easy to see that, defining in a suitable way what is meant by the area
within a contour not necessarily simple, the integral $G(C)$
over the area between any simple arc whatever which joins the
boundary circles and its image under the transformation $T$
will also be $\sigma$.

If the ratio $\sigma/\pi$ is rational, we have $\dfrac{\sigma}{2\pi} = \dfrac{m}{n}$, that is, $n\sigma = 2m\pi$.
That is, the integral $G'(C)$ over the area between $\alpha_i$ and $\alpha_{i+n}$ is equal
to the integral over the area between the first image of $\alpha_i$ and the
image of order $(m+1)$ of the same. Then the first
image of $\alpha_{i+n}$ must coincide with the image of order
$(m+1)$ of $\alpha_i$ or cut it in an interior point, and therefore
$\alpha_i$ and $\alpha_{i+n}$ will have a common point on the annulus. Hence,
if the curves $\{A_n; B_n\}$ do not intersect, $\sigma/\pi$ must be an irrational
number. On this supposition, $m$ being such that $m < \dfrac{n\sigma}{2\pi} < (m+1)$, the image
of $\alpha_n$ will lie between the images of order $(m+1)$ and $(m+2)$ of $\alpha_0$.
Since every point $P$ of the annulus has its image between the first two
images of $\alpha_0$, the corresponding image of $T^n(P)$ will lie
between the images of order $(m+1)$ and $(m+3)$ of $\alpha_0$. If $K'$ is
the maximum extent of the region between the first two images
in the direction of the coordinate $\vartheta$, and $\vartheta$, $\vartheta_n$ are the
respective coordinates of $P$ and $T^n(P)$, we have:

$$2m\pi - K' < \vartheta_n - \vartheta < 2(m+1)\pi + K'.$$

On the other hand:

$$2(m+1)\pi > n\sigma > 2m\pi,$$

whence

$$-2\pi - K' < (\vartheta_n - \vartheta) - n\sigma < 2\pi + K',$$

that is,

$$|(\vartheta_n - \vartheta) - n\sigma| < K = K' + 2\pi. \tag{IV}$$

Evidently we have $\lim \dfrac{\vartheta_n - \vartheta}{n} = \sigma$ for every point of the annulus,
and therefore (II) → (III).
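
The boundedness of $(\vartheta_n - \vartheta) - n\sigma$ is what makes the mean advance independent of the starting point. A simple one-dimensional analogue can be computed directly. The sketch below is an assumption-laden illustration, not the annulus transformation of the paper: it uses a hypothetical lift $\vartheta \mapsto \vartheta + 1 + 0.5 \sin\vartheta$ of an orientation-preserving circle map and compares the mean advance $(\vartheta_n - \vartheta)/n$ from two different starting points.

```python
import math

SIGMA, EPS = 1.0, 0.5   # illustrative parameters; EPS < 1 keeps the map one-to-one

def lift(theta):
    """Lift of a circle homeomorphism: lift(theta + 2*pi) = lift(theta) + 2*pi."""
    return theta + SIGMA + EPS * math.sin(theta)

def mean_advance(theta0, n):
    """Return (theta_n - theta_0) / n after n applications of the lift."""
    theta = theta0
    for _ in range(n):
        theta = lift(theta)
    return (theta - theta0) / n

n = 20_000
a = mean_advance(0.0, n)
b = mean_advance(2.5, n)
print(a, b, abs(a - b))
```

For such a lift the advance after $n$ steps differs from $n$ times the limiting value by less than $2\pi$ whatever the starting point, so the two estimates can differ by at most $4\pi/n$.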


Proof of III → I.

For every periodic point the preceding limit is rational
with respect to $\pi$, whence (III) → (I).



We have (I) → (II) → (III) → (I), which proves that

(I) ↔ (II) ↔ (III).

IV'. The propositions: (I') There are no interior periodic
points; (II') The curves $\{\alpha_n\}$ do not intersect in interior
points; (III') There exists an "angle of rotation" $\sigma$, irrational
with respect to $\pi$, such that the relation $|(\vartheta_n - \vartheta) - n\sigma| < K$,
($K$ = const., $0 < K < \infty$), holds for all the points of the annulus,
are equivalent to the preceding propositions.

In fact we have:

(I) → (I') → (II') → (III') → (III).

4. Definition of a continuous function $\theta(P)$ such that

$$\theta[T(P)] = \theta(P) + \sigma.$$

The property of $\sigma$ is similar to that of the angle of rotation
for continuous and one-to-one transformations of closed curves
into themselves. The transformation $T$ transforms the
boundary circles into themselves, and for both the
angle of rotation will evidently be $\sigma$. The fact that the
angles of rotation are equal on these curves is evident
from the geometric theorem of Poincaré.

There is a great similarity between the transformation $T$ and the
rotation of a circle. In fact:

V. There exists a one-to-one continuous correspondence between the
points of a circle $\Gamma$ and certain continuous sets in the ring, such
that the correspondence is preserved on applying simultaneously a
rotation to $\Gamma$ and the transformation $T$ to the ring.

The correspondence is defined by setting, if $P$ belongs to $\alpha_n$,
$\theta(P) = n\sigma \pmod{2\pi}$, and by passage to the limit. The
elements of $\{\alpha_n\}$ appear on the ring in the same order as the
corresponding ones of $\Gamma$, since the integrals $G(C)$ between elements
of $\{\alpha_n\}$ are equal to the distances between the corresponding
ones on $\Gamma$.

The set of points of the curves $\{\alpha_n\}$ is dense on the ring, that
is, the closure of the set $\{\alpha_n\}$ comprises all the points of the
ring; for otherwise there would be at least one neighborhood that meets
no curve of the specified family. The integral $G(C)$ over a closed subset
of the neighborhood would have a positive value $g$. If $N > 2\pi/g$, at
least one of the values of the integral $G(C)$ over the area between a
pair of curves $\{\alpha_i\}$ $(i = 0, 1, \ldots, N)$ will be less than
$g$. Then there will be a number $n \le N$ such that $G(C)$ over the area
between $\alpha_0$ and $\alpha_n$ is less than $g$. Taking the sequence
$\{\alpha_{in}\}$, a finite number of these curves divide the ring into
areas such that $G(C)$ over each of them is less than $g$, which implies
the existence of curves $\alpha_i$ that cut the chosen neighborhood. This
contradiction proves the density of this set of curves. Because of this
the correspondence $\vartheta = \theta(P)$ is defined over the whole ring.

If two sequences of points $\{P'_n\}$, $\{P''_n\}$
($P'_n, P''_n \in \Sigma\{\alpha_i\}$) are such that
$\vartheta'_n \to \vartheta'$, $\vartheta''_n \to \vartheta''$, where
$\vartheta' \neq \vartheta''$, there will be two neighborhoods of the
points $\vartheta'$, $\vartheta''$, distinct from each other, such that
outside them there will be only a finite number of points
$\{\vartheta'_n\}$ and $\{\vartheta''_n\}$. Then, except for a finite
number, all the points $\{P'_n\}$ and $\{P''_n\}$ will lie in distinct
closed sets, and therefore they cannot have a common limit point. Hence
to a point $P$ of the ring there cannot correspond more than one point of
the circle $\Gamma$, and since the correspondence is defined for all
points, the function $\theta(P)$ is single-valued. Since every pair of
points for which this function has distinct values can be separated by
curves of the family $\{\alpha_i\}$, the value obtained by taking any
convergent sequence of points $P$ will also be unique, and therefore the
function $\theta(P)$ is continuous.

The set of points $\{P\}$ such that $\theta(P)$ = const. is evidently a
closed set. If we consider a sequence of curves $\{\alpha_i\}$ such that
$\{\vartheta_i\}$ ($\vartheta_i = \theta(\alpha_i)$) converges toward a
value $\vartheta$ while remaining less than it, the limit of this sequence
will be a continuous set $\gamma'(\vartheta)$ whose points satisfy the
condition $\theta(P) = \vartheta$. The set $\gamma''(\vartheta)$
constructed in the same way by approaching $\vartheta$ through greater
values has the same properties. The sum of these two sets comprises all
the points that satisfy the condition $\theta(P) = \vartheta$, because
these limits are independent of the sequences chosen. These sets must
have a common point, because, if there were none, the set of curves
$\{\alpha_i\}$ would not be dense. Therefore the set
$\gamma(\vartheta) = \gamma'(\vartheta) + \gamma''(\vartheta)$ is a
continuous set which connects the boundary circles. It is evident that
the family $\{\gamma\}$ would be the same if the set of curves
$\{\alpha'_i\}$, $\{\beta_i\}$, or $\{\beta'_i\}$ were taken as basis. The
relation between this family of sets and the transformation $T$ is:
$T[\gamma(\vartheta)] = \gamma(\vartheta + \sigma)$.

Given a set $\gamma(\vartheta_0)$ and an arbitrary positive constant
$\varepsilon$, two arcs $\alpha_n$, $\alpha_m$ can be found which enclose
$\gamma(\vartheta_0)$ and are such that $G(C)$ over the area between them
is less than $\varepsilon$. Then the integral is perfectly defined between
two elements $\gamma(\vartheta_1)$ and $\gamma(\vartheta_2)$, its value
being equal to $\vartheta_2 - \vartheta_1$; that is, the value of the
integral between two elements of $\{\gamma\}$ is equal to the length of
the arc between the corresponding points on $\Gamma$.

VI. Supposing that there are simple invariant curves, and that there
are no periodic points, these curves must be closed, cannot cut one
another, and separate the boundary circles of the ring.

If the invariant curve did not separate the boundary circles, there
would be a curve connecting these circles without cutting the
invariant curve. Evidently the rotation angle for the points of the
curve would be 0 (or a multiple of $2\pi$), which is impossible because
there are no periodic points. It must be closed, because an open curve
does not divide the plane.

To prove that they do not cut one another, suppose the contrary, that
two invariant curves $J_1$ and $J_2$ cut each other without coinciding.
There would be a point $P$ of $J_1$ which would not lie on $J_2$, and
which would therefore be at a certain finite distance from $J_2$. Suppose
that $P$ is interior to $J_2$. In every neighborhood of $P$ there are
always points exterior to $J_1$, and therefore there will be a point $Q$
interior to $J_2$ and exterior to $J_1$. Then there exists a continuous
set $C'$, containing $Q$, bounded by arcs of the curves $J_1$ and $J_2$,
such that no interior point of $C'$ lies on $J_1$ or $J_2$. Since
$J_1 \times J_2 \neq 0$ and contains more than one point, the boundary of
the set $C'$ is a simple closed curve. The integral $G(C)$ over this area
is different from zero, and since the integral $G(C)$ over the complete
ring is finite, for a certain finite number $n$, $T^n(C')$ and $C'$ must
have interior points in common, and therefore must coincide. But in this
case the rotation angle $\sigma$ for the points of the set $C'$ would be
rational with respect to $\pi$, which is impossible.

5. Properties of the transformations $R$ and $U$.

The transformations $R$ and $U$ do not alter the family of curves
$\{\alpha_i\}$, and therefore do not alter the family of sets $\{\gamma\}$
either. The value of the integral $G(C)$ between $\alpha_0$ and $\alpha_1$
is $\sigma$, and the value of $G(C)$ between the corresponding
$R(\alpha_0) = \alpha_0$ and $R(\alpha_1) = \alpha_{-1}$ will be $-\sigma$,
because it is convenient to consider this area as negative, since the
cyclic order of the points on the boundary of this region is altered by
$R$. The preceding relation can be generalized by proving it for every
pair of curves $\{\alpha_i\}$ and, by passage to the limit, for every pair
of sets $\{\gamma\}$. Explicitly we have

$G[R(C)] = -G(C),$

where $C$ is an area between two sets of $\{\gamma\}$, which proves that
the transformation $R$ is for the circle $\Gamma$ exactly a reflection in
the diameter passing through the point 0. Since $T = RU$ corresponds to a
rotation of $\Gamma$, $U$ must correspond to a reflection, and $\beta$,
$\beta'$ must correspond to the points $-\sigma/2$, $\pi - \sigma/2$ of
$\Gamma$.
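The statement that a reflection $R$ through the point 0 and a second reflection $U$ compose to a rotation through $\sigma$ can be checked numerically on the circle. The following is only a sketch under stated assumptions: the formulas $R(\vartheta) = -\vartheta$ and $U(\vartheta) = -\vartheta - \sigma$, the convention that $T = RU$ means "apply $U$ first, then $R$", and the value chosen for `sigma` are all illustrative and are not taken from the paper.

```python
import math

TWO_PI = 2.0 * math.pi

def wrap(angle):
    """Reduce an angle to the interval [0, 2*pi)."""
    return angle % TWO_PI

def R(theta):
    """Assumed model of R: reflection in the diameter through the point 0."""
    return wrap(-theta)

def U(theta, sigma):
    """Assumed model of U: reflection fixing -sigma/2 and pi - sigma/2."""
    return wrap(-theta - sigma)

def T(theta, sigma):
    """Composite T = RU, read here as: apply U first, then R."""
    return R(U(theta, sigma))

def circle_distance(a, b):
    """Length of the shorter arc between two points of the circle."""
    d = abs(wrap(a) - wrap(b))
    return min(d, TWO_PI - d)

sigma = 1.0  # hypothetical rotation angle; 1/pi is irrational

# T acts as the rotation theta -> theta + sigma.
for theta in (0.0, 0.3, 2.0, 5.5):
    assert circle_distance(T(theta, sigma), theta + sigma) < 1e-12

# The fixed points of U are -sigma/2 and pi - sigma/2.
for fixed in (-sigma / 2, math.pi - sigma / 2):
    assert circle_distance(U(fixed, sigma), fixed) < 1e-12
```

Under these assumptions the two fixed points of `U` are the points to which $\beta$ and $\beta'$ correspond in the text.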


VII. If $J$ is a simple curve invariant under the transformation
$T = RU$, and there are no periodic points for the transformation $T$,
then $J$ is also invariant under $R$ and under $U$.

The curve $R(J)$ is invariant under the transformation $T$, for by
definition:

$T(J) = J \;\to\; T^{-1}(J) = J \;\to\; T[R(J)] = R[T^{-1}(J)] = R(J).$

The curve $\alpha_0$ must necessarily cut $J$. If $P$ is contained in
$\alpha_0 \times J$, then $P$ belongs to $J$ and $P = R(P)$ belongs to
$R(J)$, and by the preceding theorem $J$ and $R(J)$ must coincide. In the
same way $U(J) = J$.

Both $R$ and $U$ reverse the cyclic order of the points on $J$. Hence it
is seen that the role of $T$, $R$ and $U$ on the area between $J$ and one
of the boundary circles is the same as on the complete ring.

In particular the relation $G(R(C)) = -G(C) = G(U(C))$ holds for every
open area bounded by connected subsets of $\{\gamma\}$ and arcs of simple
invariant curves (including the boundary circles) (*).


(*) J. Lipshitz is at present studying the problem of the existence of
invariant curves.





[Reprinted from Bull. Amer. Math. Society, Vol. 28, No. 4, April-May, 1922]


BOOKS ON RELATIVITY 

Das Relativitätsprinzip. Lorentz. Einstein. Minkowski. Fortschritte
der mathematischen Wissenschaften in Monographien. Herausgegeben 
von O. Blumenthal, No. 2. Leipzig und Berlin, B. G. Teubner, dritte 
Auflage, 1920. i + 146 pp. 

Raum. Zeit. Materie. Von Hermann Weyl. Berlin, Julius Springer, vierte
Auflage, 1921. Mit 15 Textfiguren. ix + 300 pp. 

Relativity. The special and the general Theory. By Albert Einstein. 
Translated by Robert W. Lawson. New York, Henry Holt and Co., 1921. 
Frontispiece, xiii + 168 pp. 

The Theory of Relativity. By Robert D. Carmichael. Mathematical 
Monographs, Edited by Mansfield Merriman and Robert S. Woodward,
No. 12. New York, John Wiley and Sons, 2nd edition, 1920. 112 pp.

Das Relativitätsprinzip. Leichtfasslich entwickelt von Adam Angersbach.
Leipzig und Berlin, B. G. Teubner, 1920. Mit 9 Figuren im Text. 
57 pp. 

The Concept of Nature. Tarner Lectures delivered in Trinity College,
November, 1919. By A. N. Whitehead. Cambridge, The University 
Press, 1920. viii + 202 pp. 

Wiskunde, Waarheid, Werkelijkheid. Door L. E. J. Brouwer. Groningen,
P. Noordhoff, 1919. 12 pp. + 23 pp. + 29 pp. 

For scientists generally, and especially for mathematicians and phys¬ 
icists, who understand best many of the questions involved, the theory of 
relativity has fundamental interest. In the following pages our purpose 
is to pass in review the above recent books dealing with the theory and at 
the same time to indicate its present state and some unsolved problems. 

The collection of monographs gathered by Blumenthal begins with two
papers by the Dutch physicist, Lorentz, the second and more important 
one of which appeared in 1904. By endeavoring to unite the classical 
Newtonian mechanics and the electromagnetic theory of Faraday and 
Maxwell into a single consistent theory, one is necessarily led to absolute 
space (the ether) and absolute time. In fact, physics has stood committed 
to absolute time since the acceptance of Newton’s law of gravitation. But 
the experiments of Michelson in 1881 yielded an opposing result. Lorentz, 
in common with other physicists, had the conviction that the universe was 
electromagnetic in character, and he turned to the electromagnetic equa¬ 
tions for an explanation of the difficulty. His answer to the apparent 
contradiction of theory and experiment was based upon the fact that the 
equations admitted of a transformation in which space and time were
intermingled. On this basis, without giving up the concepts of absolute space
and time, he was able to explain the paradox by assuming that bodies 
undergo a slight contraction in the direction of their motion, which for the
earth is not more than a few inches. To an observer moving at uniform
velocity, the same electromagnetic equations appear to hold because such 
an observer uses “local time.” Lorentz’s explanation violated a funda¬ 
mental principle, namely the pragmatic principle that no physical entity 
exists if its presence can never be determined by any conceivable experi¬ 
ment. Absolute space and time are this type of entity in his theory. 

A year later in 1905, but independently, Einstein wrote his paper Zur 
Elektrodynamik bewegter Körper, which is the third paper of the collection.
In this he lays the foundation of the so-called special theory of relativity. 
Einstein starts with a peculiarly simple type of physical universe, perhaps 
the simplest in harmony with all known physical laws. This is the universe 
of empty isotropic space in which there are infinitesimal inertia particles. 
The particles appear from such a particle to move with uniform velocity 
in a straight line, if observations of light signals are made with the aid of a 
clock. Thus the fundamental measuring instrument is the clock. It is 
further assumed that a light pulse appears to advance at a constant velocity 
from any such particle (the Michelson experiment). On the basis of these 
postulates the transformation equations between the coordinates set up 
from reference particles are deduced, and it is shown that the Maxwell 
electromagnetic equations are unaltered under precisely this group (the 
Lorentz group) of transformations. The behavior of the electron as experi¬ 
mentally determined is in conformity with this theory. In the short 
paper that follows Einstein notes that the same discussion indicates that 
the apparent mass of a system will depend upon its energy. 

The fifth article of the collection is the mathematician Minkowski’s 
remarkable Raum und Zeit of 1908. In this article, which threw a flood
of light upon the work of Einstein and Lorentz, the geometry of four 
dimensions furnished the principal weapon. If Minkowski had lived, 
doubtless other equally important contributions to the theory of relativity 
would have come from his pen, and in any case it is clear that his influence 
upon Einstein can scarcely be overestimated. 

The gist of Minkowski’s paper is as follows: In the four-dimensional
relativistic manifold of space and time, a pair of world-points or events 
are associated with a unique number, namely, if a particle move from the 
earlier world-point to the later world-point, the interval of local time 
elapsed will give this number. The mathematician will realize at once 
that we have here the elements of a non-euclidean geometry of four 
dimensions of simple type. The Lorentz transformations are merely the 
transformations of the geometry which leave this interval between world- 
points unaltered, and the whole theory may be subsumed in the single 
equation 

$ds^2 = c^2\,dt^2 - dx^2 - dy^2 - dz^2,$

where ds is the local time element, c is the velocity of light, and dx, dy, dz, dt 
have their customary meanings. In the same article it is pointed out how 
the laws of motion of the electron may be interpreted in this space; and 
a suitable modification of the Newtonian law of attraction is made, such 
as Poincaré had given earlier. There follow some instructive notes by the
mathematical physicist Sommerfeld. 
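The statement that the Lorentz transformations leave the interval between world-points unaltered can be checked numerically. The sketch below is an illustration only: it assumes a boost along the x-axis in units where c = 1, and the function names and the sample separation are hypothetical, not taken from Minkowski's paper.

```python
import math

def boost_x(t, x, y, z, v, c=1.0):
    """Lorentz transformation to a frame moving with velocity v along x."""
    gamma = 1.0 / math.sqrt(1.0 - (v / c) ** 2)
    t_new = gamma * (t - v * x / c**2)
    x_new = gamma * (x - v * t)
    return t_new, x_new, y, z

def interval_squared(dt, dx, dy, dz, c=1.0):
    """ds^2 = c^2 dt^2 - dx^2 - dy^2 - dz^2."""
    return c**2 * dt**2 - dx**2 - dy**2 - dz**2

# Hypothetical separation (dt, dx, dy, dz) between two world-points.
# The boost is linear, so it may be applied directly to the differences.
event_difference = (2.0, 0.5, -0.3, 1.1)

before = interval_squared(*event_difference)
after = interval_squared(*boost_x(*event_difference, v=0.6))

print(before, after)  # the two values agree up to rounding error
```

Any other velocity with |v| < c may be substituted for 0.6; the two printed values should continue to agree.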




The remaining articles of the collection are reprints of more recent 
articles by Einstein, and four of them do not appear in the earlier editions. 
It is Die Grundlagen der allgemeinen Relativitätstheorie of 1916 which has
aroused such widespread attention. Concerning it, Sommerfeld says in 
the notes just mentioned: “This general relativity theory is logically so 
unified and satisfactory that it has found unconditional acceptance, espe¬ 
cially in mathematical quarters.” In a paragraph added after learning 
of the verification of Einstein’s quantitative prediction of the deviation of 
light by the sun, Sommerfeld says further, “The general relativity theory 
can therefore be regarded as an established proposition.” If this is the
truth, physical science is entering upon an era in which the new view will 
differ radically from the classical one. 

To the mathematician, Einstein’s generalized theory is of interest in 
several respects. In the first place it illustrates afresh the importance of 
taking the simplest possible case as an abstract basis of departure. Sec¬ 
ondly, Einstein uses mathematical analogy in passing, step by step, 
from the simple universe of the special theory to the most general universe, 
and at each step the mere sense of mathematical form is sufficient to point 
toward a natural generalization. The mathematician may feel satisfied 
that the formal analogies supplied by classical dynamics and four-dimen¬ 
sional geometry furnish the very basis by which Einstein’s generalization 
proceeds. And, thirdly, the technical tool which made elaboration and 
verification of the theory possible is the invention of the mathematicians 
Riemann, Christoffel, and more especially of Ricci and Levi-Civita—namely 
the absolute differential calculus. 

What then are these successive steps of Einstein? To the writer they 
appear as follows: 

(1) In the special theory of relativity, the universe consists of an empty 
isotropic space with infinitesimal inertia particles, and the central formula 
is that for $ds^2$ given above.

(2) A somewhat more general type of universe is that of a non-isotropic 
empty space formed by a gravitational field. It is natural, by analogy 
with the Riemann geometry, to assume that local time is given by a quadratic
differential form $ds^2$ in the space and time variables, and that the
particular coordinates chosen are irrelevant (general theory of relativity).
But in such case only six of the ten coefficients in $ds^2$ must be regarded
as arbitrary. Therefore there are required six equations to fix these coeffi¬ 
cients and these conditions must be independent of the coordinate system. 
This leads to Einstein’s conclusion that the contracted Riemann tensor 
vanishes. By analogy with the special theory of relativity, the paths of
the particles appear as the geodesics and the paths of the light pulse 
satisfy the equation ds = 0. 

(3) Still more generality is obtained if matter and energy are present. 
For case (1), this leads to the vanishing of the divergence of an “energy 
tensor.” In the equations obtained in case (2) the left-hand members are
tensors while the right-hand members vanish. It is natural then to assume 
by analogy with classical dynamics that, in case the space contains matter 
and energy, the right-hand member becomes the energy tensor. If we
assume this to be the case, the complete equations, as general in their scope
as those of the classical theory, are obtained. 
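The count of coefficients in step (2) above can be spelled out as a short piece of arithmetic. This is only an illustrative sketch; the function name is hypothetical, and the subtraction of four assumes that the four arbitrary coordinate functions each remove one coefficient from the count.

```python
def symmetric_coefficients(n):
    """Number of independent coefficients of a quadratic form in n variables."""
    return n * (n + 1) // 2

# A quadratic differential form in four space-time variables.
total = symmetric_coefficients(4)

# Assumption: each of the four arbitrary coordinate functions
# accounts for one coefficient.
coordinate_freedoms = 4
essential = total - coordinate_freedoms

print(total, essential)  # prints "10 6"
```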

The causal principle is effective, but in an obscure form, as follows: 
At any “instant” for the coordinate system under consideration the time 
rate of change of the derivatives of the gravitational tensor formed by the 
system of coefficients in $ds^2$ and of the energy tensor are known. Also the
apparent accelerations of the particles are thereby determined. Conse¬ 
quently it is possible to obtain the new value of the tensors and their 
derivatives, and the new positions and velocities of the particles, an instant 
later, and so to proceed indefinitely. 

According to Einstein the acceleration of a particle in empty space is due 
to gravitational forces, and depends only on $ds^2$ and the coordinates chosen.
The law of motion at low velocities is found to be the same as that of 
Newton except for very small modifications. To make a specific application,
Einstein determines the necessary form of $ds^2$ for a single central
body such as the sun, also on a postulational basis, and arrives at his 
brilliant predictions. 

In the final paper of the collection, written in 1917, Einstein shows how
the idea of a spherical space may be conveniently introduced to eliminate 
the difficulties due to boundary conditions. 

The casual reader of the theory of relativity will feel a certain lack of 
concreteness. The classical physical theories seemed to touch reality in at 
least three ways, namely in the independent concepts of space, of time, of 
force. I take it to be self-evident that any genuine physical theory must 
touch reality somewhere. So far as I can see, the Einstein theory does this 
at one and at only one place, namely in its concept of local time which 
can be measured by means of the natural clock, the atom. In an article 
appearing in January, 1921, in the Proceedings of the Berlin Academy, 
Einstein lays emphasis upon this notion of the natural clock as the 
fundamental element, but one could wish that this had been done more 
definitely in his earlier articles. 

From the scientific point of view the most important of the other books 
which we desire to review is Weyl's Raum. Zeit. Materie. The Einstein
theory has a certain pliability in the presence of an energy tensor, which 
may be modified to suit the exigencies of the physical situation under dis¬ 
cussion. On the other hand, this pliability will appear as a defect to some 
minds since it provides physical science with a blank form rather than with 
a definitive theory. It may naturally be expected that theories will be 
forthcoming which attempt to explain non-gravitational phenomena also 
on a similar quasi-geometrical basis. One recalls here the vortex theory 
of the atom as an analogous attempt in classical physics. 

The original part of Weyl’s valuable and complete treatise consists in 
an attempt to deduce the electromagnetic equations in such a manner. 
For this purpose he invents a generalization of the Riemann geometry. 
In the Riemann geometry, the elements $ds^2$ can be compared at various
parts of the manifold. Weyl notes that this is a species of action at a
distance and proposes to compare the elements $ds^2$ only for various directions
at a world-point. In other words his quadratic form $ds^2$ is one in which
merely the ratios of the coefficients are important and the coefficients appear
as undetermined up to an arbitrary multiplicative scale-factor.*

Thus in the normalization of his quadratic form by change of variables, 
he has five arbitrary functions (the four arbitrary coordinate functions 
and his scale-function) instead of the four functions available in the Einstein 
theory. Weyl is able to use the notion of parallel displacement due to 
Levi-Civita; namely, the small vector can be displaced so as to maintain 
size and direction in a specially chosen geodesic coordinate system. As 
this vector varies in position and returns to its starting point it will not 
have the same length except in the case of the Riemann geometry. The 
logarithmic derivative of the scale-function is a differential, whose four 
components behave as the components of an electromagnetic potential. 

An obvious objection to Weyl’s extension is that he loses contact with 
the real, for ds can no longer stand for the element of local time; otherwise 
we should expect atoms of the same element with different past histories
to have different rates of vibration, and such has never been observed to
be the case. In the second place, a modified $ds^2$ for the same manifold
can be obtained which is invariant as in the Riemann geometry, and
thus we are led back to the Einstein theory together with a single
independent equation of the type coming under Einstein’s theory. These
criticisms of Weyl’s work are given in the March, 1921, number of the
Proceedings of the Berlin Academy by Einstein. Eddington has proposed
a further modification of Weyl’s theory in the April number of the
Proceedings of the Royal Society. Eisenhart and Veblen have gone
much further in an important paper in the Proceedings of the National 
Academy, February, 1922. 

We pass now to the more popular treatments given in the next three 
books. The first of these is by Einstein himself, and affords an interesting 
and skillful approach to the fundamentals of the theory. 

The second edition of Carmichael’s book contains his earlier treatment 
of the special theory based upon a set of physical postulates. The new 
chapters are a direct summary of the results of the general theory, as
presented by Einstein and Eddington. This summary is too abbreviated 
to be followed with much profit by the reader who has not delved elsewhere 
into the theory. 

The tiny pamphlet by Angersbach presents a brief historical develop¬ 
ment of the notions underlying the special relativity together with an 
elementary presentation. It is readable. 

It is obvious that the relativity theory has decided significance for 
philosophical thought, as indeed every new physical theory must have. 
In his interesting book, The Concept of Nature, the English philosopher- 
mathematician Whitehead expounds his views of the physical universe 
in the light of the theory of relativity. The book has obvious relations 
with an earlier book.† The main idea of Whitehead is that the underlying
realities are attained by a method of extensive abstraction, a spatial point
for instance being generated by all the objects of a certain category (those
which include it spatially). His analysis of experience is very interesting.
The mathematician will regret frequently redundancy and vagueness in
philosophical treatises. Of this there is little in Whitehead's book. The
importance and the exactitude of many of his analyses must be admitted.
Maxime Bôcher once said to the writer, “What man would be a philosopher
who might be a mathematician!” One feels that Mr. Whitehead deserves
both titles.

* See Weyl’s recent papers in the Mathematische Zeitschrift, vol. 12,
Nos. 1, 2 (1922).

† An Inquiry Concerning the Principles of Natural Knowledge, A. N.
Whitehead. Cambridge, University Press, 1919. xii + 200 pp.

The contrasting account given in this book between the old theories and 
the new theory of relativity is interesting. Characterizing the old theories, 
Whitehead says: "For example, colour is the result of a transmission from 
the material object to the perceiver’s eye; and what is thus transmitted is 
not colour. Thus colour is not part of the reality of the material object. 
Similarly for the same reason sounds evaporate from nature. Also warmth 
is due to the transfer of something which is not temperature. Thus we 
are left with spatio-temporal positions, and what I may term the ‘pushiness' 
of the body. This leads us to eighteenth and nineteenth century material¬ 
ism, namely, the belief that what is real in nature is matter, in time and 
in space and with inertia." 

The new relativity theory he expounds as follows: "Let us make there¬ 
fore the general statement that four measurements, respectively of inde¬ 
pendent types (such as measurements of lengths in three directions and a 
time) can be found such that a definite event-particle is determined by 
them in its relations to other parts of the manifold. ... If ($p_1$, $p_2$, $p_3$, $p_4$)
be a set of measurements of this system, then the event-particle which is
thus determined will be said to have $p_1$, $p_2$, $p_3$, $p_4$ as its coordinates in this
system of measurement." . . . "Then we should naturally say that
($p_1$, $p_2$, $p_3$) determined a point in space and that the event particle happened
at that point at the time $p_4$. . . . Furthermore the inhabitant of Mars
determines event-particles by another system of measurements. Call his
system the q-system. According to him, ($q_1$, $q_2$, $q_3$, $q_4$) determines an
event-particle, and ($q_1$, $q_2$, $q_3$) determines a point and $q_4$ a time. But the
collection of event-particles which he thinks of as a point is entirely different
from any such collection which the man on earth thinks of as a point. Thus
the q-space for the man on Mars is quite different from the p-space for the
land-surveyor on earth. ..."

". . . We have got to find the way of expressing the field of activity of 
events in the neighborhood of some definite event-particle E of the four- 
dimensional manifold. I bring in a fundamental physical idea which I 
call the ‘impetus' to express this physical field. The event-particle E is
related to any other neighboring event-particle P, by an element of impetus."
. . . "Einstein showed how to express the characters of the assemblage
of elements of impetus of the field surrounding an event-particle E in terms
of ten quantities which I will call $J_{11}$, $J_{12}$, . . . The numerical values of the
J’s will depend on the system of measurement adopted, but are so adjusted
to each particular system that the same value is obtained for the element
of impetus between E and P, whatever be the system of measurement
adopted. This fact is expressed by saying that the ten J’s form a ‘tensor.’ ”
... “We now return to the path of the attracted particle. We
add up all the elements of impetus in the whole path, and obtain thereby
what I call the ‘integral impetus.’ The characteristic of the actual path
as compared with neighbouring alternative paths is that in the actual paths 
the integral impetus would neither gain nor lose, if the particle wobbled 
out of it into a small extremely near alternative path.” 

Evidently Whitehead is expressing the relativistic theory of the path 
of a particle in a gravitational field. The indefinite general terms used 
stand in unfavorable contrast with those which can be used in the ex¬ 
position of the classical theory. 

Finally we turn to the little pamphlet by Brouwer. Many American 
mathematicians have read in this Bulletin (November, 1913) a translation
of Brouwer’s paper on Intuitionism and Formalism, which is the final
essay of the pamphlet. Those who know the mathematical work of 
Brouwer will be interested in these essays, in the second of which he touches 
upon the special theory of relativity with emphasis upon the notion of 
group. Attention should also be directed to his noteworthy analysis of 
the logical principle of the excluded middle, to which reference is made in 
the first essay. 

In conclusion, one or two remarks of general character suggest them¬ 
selves. 

The theory of relativity in its general form or in its special form involves 
a definite group. To the mathematician at least it would be of considerable 
interest to see physical theories developed for other special groups. In 
particular the most general group of all, that of analysis situs, suggests 
itself, for this alone appears strictly proper in the general theory of rela¬ 
tivity, where any transformation of coordinates whatsoever ought to be 
admitted. If such a theory can be constructed, the interrelation of con¬ 
tinuous manifolds of arbitrary form will form the essential element. A 
theory of this kind would seem to be consonant with quantum theory. 

Also, with others, we may call attention to the fact that no theory of 
relativity so far explains the difference between positive and negative 
electricity, or throws any light upon the constitution of matter. Further¬ 
more no real reason appears why the velocities of the stars relative to one 
another are so small in comparison with the velocity of light. 

While awaiting further developments, let us at least say with Whitehead 
of Einstein’s investigations: “They have made us think.” Some may
agree with the final words of Weyl: “A few chords of that harmony of 
the spheres of which Pythagoras and Kepler dreamed have fallen upon 
our ears.” 

G. D. Birkhoff. 





Reprinted from the Proceedings of the National Academy of Sciences,
Vol. 13, No. 3, March, 1927.


A THEORY OF MATTER AND ELECTRICITY¹

By George D. Birkhoff

Department of Mathematics, Harvard University
Communicated January 22, 1927


Up to the present time no mathematical theory of matter and electricity 
seems to have been proposed which meets the fundamental demands of 
determinateness and stability. A theory which appears to satisfy these 
demands is presented herewith. In a second following note it is proved 
that this theory leads to a formula of the Balmer type for the frequencies 
of the small oscillations of a hydrogen atom. 

1. The Perfect Fluid. —We consider first the space-time of the special 
theory of relativity with time coordinate $x_1$ and rectangular space
coordinates $x_2$, $x_3$, $x_4$, the units of length and time being so chosen that the
velocity of light is 1. By an “adiabatic fluid” is meant matter whose state
is determined by a functionally related pressure and density, $p$ and $\rho$,
such that if $u^i = dx_i/ds$ denotes the velocity tensor, and if the “energy
tensor” is defined to be

$T^{ij} = \rho u^i u^j - p g^{ij}$
$(g^{ij} = 0,\ i \neq j;\quad g^{11} = -g^{22} = -g^{33} = -g^{44} = 1),$  (1)

then the equations of motion are obtained by setting

$\partial T^{ij} / \partial x_j = 0.$  (2)


These equations express the relativistic form of the principles of conserva¬ 
tion of energy (i = J) and of linear momentum in the direction of the three 
axes (i = 2, 3, 4). From them it appears that a certain quantity 




$$\int e^{\int \frac{d\rho - dp}{\rho}}\, d\tau \qquad (d\tau,\ \text{element of volume}) \tag{3}$$

has always the same value over any given portion of the fluid; 2 for example, 
if $p = \rho$, it is the volume $\int d\tau$ which is invariable. Under the action 
of an arbitrary body force tensor $f^i$, the only modification necessary is to 
replace the right-hand member of (2) by $f^i$. It is understood that the only 
forces allowed must be orthogonal to the velocity tensor, i.e., $f^\alpha u_\alpha = 0$. 

A possible state of equilibrium of such a fluid is that of constant pressure 
and density. If one computes the velocity of a slight disturbance from 
this state, it turns out to be $(a/(1-a))^{1/2}$, where we write $a = dp/d\rho$. 3 

Of course, the velocity of such an elastic disturbance cannot exceed that of 
light. On the other hand, if that velocity is less than that of light, difficulties 
of indeterminateness seem to arise when two portions of the adiabatic 
fluid collide at relative velocities sufficiently near the velocity of light. 
In order to avoid this difficulty of indeterminateness, it seems to be necessary 
to have a disturbance velocity equal to that of light at all pressures 
and densities. Such a fluid, for which $p = \tfrac{1}{2}\rho$, will be termed the "perfect 
fluid," by analogy with an ordinary perfect gas. 
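
As a rough numerical check of the argument above, the following Python sketch tabulates the disturbance velocity for a few linear laws $p = a\rho$. It assumes that the velocity has the form $(a/(1-a))^{1/2}$ with $a = dp/d\rho$, in units where the velocity of light is 1; the function name `disturbance_velocity` is illustrative only and does not come from the note.

```python
import math

def disturbance_velocity(a):
    """Speed of a small disturbance, ASSUMING v = sqrt(a / (1 - a)), a = dp/drho."""
    if not 0.0 <= a < 1.0:
        raise ValueError("a = dp/drho must lie in [0, 1)")
    return math.sqrt(a / (1.0 - a))

# Linear laws p = a * rho for a few slopes a (velocity of light = 1).
for a in (0.1, 0.25, 0.5):
    print(f"a = {a:4.2f}  ->  v = {disturbance_velocity(a):.4f}")
```

Under this assumption only the slope $a = 1/2$, that is $p = \tfrac{1}{2}\rho$, gives a disturbance velocity equal to that of light at every density; smaller slopes give slower disturbances.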

2. Electricity and the Perfect Fluid.—If $F_{ij}$ is a skew-symmetric tensor 
of the second order so that $F_{ij} = -F_{ji}$, then the tensor 

$$f_i = F_{i\alpha} u^\alpha$$

yields a force tensor which is in all cases orthogonal to the velocity tensor. 
We will regard $F_{12}, F_{13}, F_{14}$ as the components of electric force and 
$-F_{34}, -F_{42}, -F_{23}$ as the components of magnetic force. The well-known 
Maxwell-Lorentz equations may then be written in the form 

$$\frac{\partial F_{ij}}{\partial x_k} + \frac{\partial F_{jk}}{\partial x_i} + \frac{\partial F_{ki}}{\partial x_j} = 0, \qquad \frac{\partial F_i^{\ \alpha}}{\partial x_\alpha} = -4\pi\sigma u_i, \tag{4}$$

where $\sigma$ is the density of electricity. 

From these it follows that the total charge $\int \sigma\, d\tau$ is invariable, so that 
electricity may be regarded as a substance. 

Now we suppose that the perfect fluid is permanently charged with 
electricity of which it acts as the carrier. More precisely if we define 
the "electromagnetic energy tensor" as 


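$$E_{ij} = \frac{1}{4\pi}\left(F_i^{\ \alpha} F_{\alpha j} + \tfrac{1}{4} g_{ij} F^{\alpha\beta} F_{\alpha\beta}\right), \tag{5}$$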


the ponderomotive equations for combined matter and electricity take a 
form like (2) except that the energy tensor $T_{ij}$ of matter is replaced by 
$T_{ij} + E_{ij}$, i.e., by the combined energy tensor of the perfect fluid and of 
electricity. 

Since the quantity (3), as well as the electrical charge, is invariable, it 
follows that these must remain in a constant ratio along any world line of 
the fluid. For the case of the perfect fluid this leads at once to the equation 

$$\rho = \frac{\varphi^2 \sigma^2}{\pi}. \tag{6}$$

Here $\varphi/\sqrt{\pi}$ is the constant ratio referred to, with the divisor $\sqrt{\pi}$ 
introduced for the sake of convenience. The quantity $\varphi$ will be termed the 
"substance coefficient." 4 It is an arbitrary function of position, specified 
once for all at the outset and remaining fixed in value along each world line 
of the fluid. 

Since electricity of one sign repels itself, and since the perfect fluid tends 
to expand under the enormous elastic pressures, the charged perfect fluid 
tends to expand all the more with velocities which approach that of light. 
Under such expansion the mass of the relativistic fluid approaches zero 
since the density of matter varies as the square of the density of electricity 
by (6). 

3. Atomic Potential Energy.—As a first step toward relieving the 
instability just referred to we propose to define an "atomic potential" $\psi$, 
given once for all at each particle of the fluid. In this case along the 
world line of a particle we have 

$$\frac{d\psi}{ds} = \frac{\partial \psi}{\partial x_\alpha} u^\alpha = 0,$$

so that the tensor $\partial\psi/\partial x_i$ may function as a second force tensor, being 
orthogonal to the velocity tensor. It is this further body force which we 
shall assume to be present in the fluid. If we define 

$$A_{ij} = \psi g_{ij} \tag{7}$$

as the "atomic potential energy tensor," then the laws of motion for the 
perfect fluid under elastic, electrical and atomic potential forces may be 
expressed in the form (2) except that $T_{ij}$ is replaced by the complete energy 
tensor 

$$H_{ij} = T_{ij} + E_{ij} + A_{ij},$$

i.e., the equations of motion are 

$$\frac{\partial H^{i\alpha}}{\partial x_\alpha} = 0. \tag{8}$$

It is to be remembered that in addition to these equations there is the 
constitutive equation $p = \tfrac{1}{2}\rho$ as well as the Maxwell-Lorentz equations 
(4), and the equation 

$$\frac{\partial \psi}{\partial x_\alpha} u^\alpha = 0, \tag{9}$$

expressing the fact that $\psi$ is constant along each world line. 




It would seem to be desirable that $\psi$ be required to vanish along the free 
boundary of any portion of the perfect fluid. Otherwise there will be 
an inward normal pressure equal to $\psi$, and the resultant of the internal 
body forces due to the atomic potential will not vanish. 

4. The Conservation of Energy.—If one transforms properly the 
equation (8) for $i = 1$, it leads to a principle of the conservation of energy 
at low velocities in which 

$$V = \int \left[\rho - p + \frac{1}{8\pi}\left(F_{12}^2 + \dots + F_{34}^2\right) + \psi\right] d\tau \tag{10}$$

figures as potential energy, so that only half the mass counts as elastic 
energy. The electromagnetic energy density has the usual form. 

The atomic potential energy is proportional to the volume, and in consequence 
the volume cannot increase indefinitely. Here then is a first factor 
operating in the direction of stability. A portion of the fluid of one sign 
would tend to scatter in wisplike form in space rather than to fill all of space. 

For $i = 2, 3, 4$ the equation (8) expresses the principles of the conservation 
of linear momentum in the direction of the $x_2, x_3, x_4$ axes, respectively. 

5. Gravitation.—If now, in accordance with Einstein's general scheme 
for taking care of gravitational phenomena, we write 

$$R_{ij} - \tfrac{1}{2} g_{ij} R = -8\pi H_{ij}, \tag{11}$$

and employ the same laws as before in the sense of the “principle of 
equivalence,” an appropriate extension of the above theory of matter and 
electricity to the space-time of the general theory of relativity results. 
As is well known, the relations (8) then appear as implicit in (11). The 
gravitational effect of matter and energy is, however, not exactly proportional 
to the energy density (see (10)) but is instead given by 


P + (F n * + ... + F„*) + (12) 

oir 

6. The Structure of Matter. Stability.—To obtain a theory of matter 
satisfying the fundamental requirements of determinateness and stability 
we may now proceed as follows. 

Let us suppose that the protons are initially portions of the perfect fluid 
carrying a charge $e$, and of one or more types of spherical distribution of 
$\varphi$, $\psi$. These protons may collide with one another or with themselves, 
but by definition are not to interpenetrate. 

Similarly, certain other portions of the perfect fluid of one particular 
type carrying a charge $-e$ constitute the electrons. Their mass is much 
less and the volume much greater than that of the protons. These 
may collide with one another, but by definition they cannot interpenetrate. 




If now we were to require that electrons and protons cannot interpenetrate 
it is readily proved that they tend to wisplike forms of finite volume 
with nearly neutralizing portions of positive and negative electricity in 
every part. The natural way in which to avoid this difficulty of an amorphous 
stable condition is to allow the free interpenetration of electron and 
proton. It is almost as though the proton were taken as a point charge 
which freely penetrates the electron, although this limiting case introduces 
the vital difficulty of infinite available energy. 

We, therefore, require the free interpenetration of the proton and electron. 
However, there can be no three fluids overlapping at a point, since two of 
the same sign cannot overlap. The appropriate modification in the equations 
of motion is easily made. In the Maxwell-Lorentz equations (4) 
the current density tensor $\sigma u_i$ is replaced by $\sigma_+ u_i + \sigma_- v_i$, where $\sigma_+$ is the 
density of positive charge with velocity tensor $u_i$, while $\sigma_-$ is the density of 
the negative charge with velocity tensor $v_i$. Similarly the energy tensors 
$T_{ij}$ and $A_{ij}$ are modified, respectively, to 

$$T^*_{ij} = \rho_+ u_i u_j + \rho_- v_i v_j - (p_+ + p_-) g_{ij}, \qquad A^*_{ij} = (\psi_+ + \psi_-) g_{ij},$$

where the meaning of the notation is obvious. 

The equations (8) express as before the four conservation conditions, 
and follow at once from the equations of motion for the interpenetrating 
fluids. For example, for the proton these are 

or 


t>x a 


= /' 


where $f^i$ is the electromagnetic force tensor for the proton, 

$$f_i = \sigma_+ F_{i\alpha} u^\alpha.$$

In this way, even in the space-time of the general theory of relativity, 
the modified equations are at once obtained. 

With such a set of requirements it is not hard to see that a position of 
stable equilibrium of any system of protons and electrons can be expected 
to appear. In fact none of the protons or electrons can contract indefinitely 
since that would require an infinite elastic potential energy. Nor can they 
expand indefinitely because of the atomic potential energy. And, finally, 
the electrical attractions between the oppositely charged protons and 
electrons, and their free interpenetration, tend to make them coalesce 
into neutral stable forms made up of a number of connected pairs of protons 
and electrons forming the individual atoms. 

1 An outline of this theory was presented in an address entitled "A Mathematical 
Critique of Some Physical Theories,” given before a meeting of Section A of the American 
Association for the Advancement of Science, the American Mathematical Society and 
the Mathematical Association of America on Dec. 30, 1926. The address will appear 
in full in an early number of the Bulletin of the American Mathematical Society. 





Reprinted from Proc. Nat. Acad. Sciences, March, 1927, pp. 165-169. 


THE HYDROGEN ATOM AND THE BALMER FORMULA 

By George D. Birkhoff 
Department of Mathematics, Harvard University 
Communicated January 22, 1927 


I propose to apply the theory of the preceding note to the consideration 
of the small oscillations of the proton and single electron forming the 
hydrogen atom. On making certain simple assumptions about the 
substance coefficients and atomic potentials at our disposal, characteristic 
difference frequencies are obtained which are of the same type as those 
given by the Balmer formula. 1 

1. The State of Equilibrium.—As stated at the end of the preceding 
note, we may expect more or less complete coalescence of the electron and 
proton in the spherical condition of equilibrium. Let $\sigma = \sigma_+ + \sigma_-$ 2 
stand for the total density of charge. The radial electric force is clearly 

$$F = \frac{4\pi}{r^2} \int_0^r \sigma r^2\, dr \tag{1}$$


and will vanish at the boundary surface $r = r_0$ since the total charge is zero. 
The conditions for equilibrium reduce at once to two: 


dr \ 


*4T+ 2 


) 


+ ) + <r+F = 0; 


dr \ 


*>-’*-* 


-) 


-f ) 4- <r- F = 0 


where $F$ has the value given above. These equations show that the 
radial distributions of electrical densities and substance coefficients in 
equilibrium can be taken at pleasure for both the proton and electron, 
but that then the atomic potentials are uniquely determined. It is to 
be recalled that the arbitrary constants must be so taken that $\psi_+$, $\psi_-$ 
vanish at the boundaries of the proton and electron, respectively. 

It seems clear that the simplest possible assumption to make in the 
choice of the arbitrary densities is that they are constant throughout and 
numerically equal; in this case the electrical force F vanishes everywhere, 
and the conditions for equilibrium reduce to 




2tt 


+ a+. 



V- 2 k 2 

2ir 


+ «- 


( 2 ) 




where $k$ is the common numerical value of $\sigma_+$ and $\sigma_-$. On account of the 
fact that the atomic potentials and substance coefficients remain constant 
along the world line of any point-particle of the proton or electron, these 
functional relations (2) between $\psi_+$ and $\varphi_+$, and $\psi_-$ and $\varphi_-$, must hold 
identically. It is to be noted that $\varphi_+$ and $\varphi_-$ still remain undetermined, 
although later a definite choice will be specified. It will be observed 
that thus far we have simply demanded a completely neutral stable state 
of the atom under a uniform charge. 

2. The Equations of Oscillation.—Let now $X, Y, Z$ denote the electrical 
forces in the directions of the $x, y, z$ axes, respectively, while $L, M, N$ 
denote the corresponding magnetic forces. If we eliminate $L, M, N$ 
in the Maxwell-Lorentz equations we obtain relations of the familiar type 

$$\frac{\partial^2 X}{\partial t^2} = \Delta X - 4\pi \frac{\partial \sigma}{\partial x} - 4\pi \frac{\partial u}{\partial t}, \tag{3}$$

$$\frac{\partial X}{\partial x} + \frac{\partial Y}{\partial y} + \frac{\partial Z}{\partial z} = 4\pi\sigma, \tag{4}$$

where $\Delta$ denotes the Laplacian operator and $u, v, w$ denote the current 
densities. These equations hold at velocities small in comparison with 
that of light. Since all the variables vanish in the equilibrium state, they 
will be small quantities for slightly disturbed motions. 

Similarly, if we write down the conditions for the conservation of mo¬ 
mentum we obtain for the proton three equations of the type 

<r+ dt dx \ <r+ p+ / 

where the bars indicate the values of the variables in the equilibrium state 
and where $\sigma_+$, $\varphi_+$, $\psi_+$ refer now to the small changes in these variables. 
But from the first equation (2) we find at once by the method of variation 

<p+k 2 f>¥ 

= - f*- = .=— 

jt *>*• 

where $\psi_+$, $\varphi_+$ are defined as stated. Hence the terms in $\psi_+$, $\varphi_+$ of the 
preceding equation of motion cancel, and that equation simplifies to 

du+ _ _ d<r+ _ 2 y>+' x<r+ ^ it ^ 


where the accent indicates differentiation with respect to r. By combining 
this equation for the proton with the analogous one for the electron we 
obtain 

^ ^ _ 2 t- + * X. (5) 

dt dx p r p 2 




provided that we assume that $\varphi_+$ and $\varphi_-$ are in a constant ratio throughout 
the atom in the position of equilibrium. Here $\varphi$ is defined by the equation 



(6) 


Thus at this stage there remains still to be determined the arbitrary 
function $\varphi$ and the constant ratio $\varphi_+/\varphi_-$. We shall assume that the 
substance coefficient of the electron is comparatively very small, thereby 
requiring $\varphi$ to be very small also. 

Substituting the expression (5) in (3) we obtain 


= AX-¥x + 

bt 2 p 2 tp r 


(7) 


with two similar equations in $Y$ and $Z$. Here $\sigma$ is defined through (4) 
in terms of $X, Y, Z$. 

Finally, then, if we replace the variables $X, Y, Z$ by $Xe^{ipt}, Ye^{ipt}, Ze^{ipt}$, 
where $p/2\pi$ is the frequency, we obtain the three frequency equations 
valid within the atom, such as 


« + (,,-)x + 2 |.(f + ^ + g)-o. 


(8) 


while the corresponding equations in empty space are of the form 

$$\Delta X + p^2 X = 0. \tag{8'}$$

There are obvious boundary conditions upon the solutions sought such 
as that $X, Y, Z$ are to be finite and continuous everywhere, and are to 
vanish suitably at infinity; furthermore $\sigma$ is to vanish for $r \geq r_0$ in empty 
space. 

3. On a Limiting Approximation.—In order to determine the fundamental 
frequencies of the atom, it is necessary now to assume a particular 
form for $\varphi$ which we shall define by means of the equation 


_ qJ _ 20V Z + y 

•P 2 r r 2 


(ay > fi 2 ), 


(9) 


where $\alpha$ is to be an exceedingly large positive constant, whose magnitude 
we shall not attempt to determine. 

It is obvious that for $\alpha$ sufficiently large $\varphi'/\varphi$ becomes, in general, very 
small within the atom. Thus we are led to simplify (8) to the approximate 
form 

AX + [p J — — 1 ] X = 0. (10) 

We propose then to search for the everywhere finite solutions of these 
equations (10), (8'). Of course, the equations (10) are of the same general 
type as the Schrödinger “wave equation," although the origin and significance 
in his theory are entirely different. 

We propose, furthermore, to treat the radius of the electron as infinite. 
This seemingly paradoxical result may be explained as follows. Let us 
determine the constant $\beta$ by use of Planck's constant as follows. 




$$\beta^2 = \frac{8\pi^3 m e^4}{h^3},$$

where $m$, $e$ and $h$ have the usual significance in C. G. S. units. This gives $\beta^2$ 
of the order $10^{16}$. If we take units of length smaller in the ratio 1 to 
$\beta\sqrt{\alpha}$, (10) takes the form 

AX + (*+ 7 - 7 ,)* “ 0 - (10,) 

Hence if $1/(\beta\sqrt{\alpha})$ light-seconds is small in comparison with the atomic 
radius, the atomic radius will be large for the units employed in (10'). 
It will be assumed that the absolute constant $\gamma$ is not large. 

4. Determination of Frequencies.—The everywhere finite solutions of 
(10) may be found as follows. Since a typical form of disturbance of the 
atom is that introduced by the removal of a uniform field of force in the 
direction of the $z$ axis for instance, it is natural to write 

$$X = \frac{x}{r} F, \qquad Y = \frac{y}{r} F,$$

where the radial electric force $F$ satisfies the equation 

$$\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 \frac{\partial F}{\partial r}\right) + \frac{1}{r^2 \sin\varphi}\frac{\partial}{\partial \varphi}\left(\sin\varphi\, \frac{\partial F}{\partial \varphi}\right) + \left(p^2 - \alpha^2 + \frac{2\beta\sqrt{\alpha}}{r} - \frac{\gamma + 2}{r^2}\right) F = 0$$

and depends only on the variables $r$, $\varphi$ ($r$, $\varphi$, $\psi$ being spherical coordinates). 
A solution $F = R(r)\Phi(\varphi)$ is single valued in $\varphi$ only if $\Phi$ is a Legendre's 
polynomial of the $n$th degree, in which case $R$ satisfies the equation 

$$R'' + \frac{2}{r} R' + \left(p^2 - \alpha^2 + \frac{2\beta\sqrt{\alpha}}{r} - \frac{n(n+1) + \gamma + 2}{r^2}\right) R = 0.$$

Essentially only one solution, $R$, will be finite at $r = 0$ for any value of $p$ 
and $\alpha$, namely, that solution which belongs to the positive exponent $\mu$ 
given by the characteristic equation 

$$\mu(\mu + 1) = n(n + 1) + \gamma + 2. \tag{11}$$

If it is also to vanish suitably at $r = +\infty$ it must have the form 

$$r^\mu e^{-qr} P(r),$$

where $P$ is a polynomial of degree $l$ ($l = 0, 1, 2, \dots$) satisfying the 
equation 

$$rP'' + 2(\mu + 1 - qr)P' + 2\left(\beta\sqrt{\alpha} - (\mu + 1)q\right)P = 0$$

of the Laplace type and where $q^2 = \alpha^2 - p^2$. The $l$th derivative $P_l$ 
satisfies a like equation (obtained by $l$ differentiations) in which the coefficient 
of $P_l$ must vanish. This gives the approximate formula 


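$$p = \alpha - \frac{\beta^2}{2\left[l + \tfrac{1}{2} + \left((n + \tfrac{1}{2})^2 + \gamma + 2\right)^{1/2}\right]^2} \qquad (l = 0, 1, 2, \dots)$$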


where $\gamma$ is small. Hence we see that a high limiting frequency is present 
as suggested by Schrödinger. 3 The only finite corresponding $Z$ is 0. 

On the other hand we may take $X = Y = 0$ and then determine $Z$ to 
be finite, in which case the frequencies are readily found to be the same 
as before except that $\gamma + 2$ is replaced by $\gamma$. It is this type of oscillation 
which has physical importance, since the nearby passage of an electron 
would produce it. In this case if $\gamma$ is neglected the difference frequencies 
of the Balmer formula appear with 


$$\frac{2\pi^2 m e^4}{h^3}\left(\frac{1}{(1 + l)^2} - \frac{1}{(1 + l')^2}\right)$$


where $l$, $l'$ ($l > l'$) are any pair of distinct integers. 
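
For orientation, the coefficient of this Balmer-type expression can be evaluated numerically. The Python sketch below is only an illustration: the constants are approximate modern C. G. S. values, not figures taken from the note, and the function names are hypothetical. It prints the magnitude of the difference frequency for a few pairs of nodal numbers.

```python
import math

# Approximate modern C.G.S. values (assumed; not from the 1927 note).
M_E = 9.109e-28    # electron mass, g
E_CH = 4.803e-10   # elementary charge, esu
H = 6.626e-27      # Planck's constant, erg s

def balmer_coefficient():
    """Coefficient 2 pi^2 m e^4 / h^3 of the Balmer-type formula, in Hz."""
    return 2.0 * math.pi ** 2 * M_E * E_CH ** 4 / H ** 3

def difference_frequency(l, l_prime):
    """Magnitude of the difference frequency for nodal numbers l and l'."""
    term = 1.0 / (1 + l) ** 2 - 1.0 / (1 + l_prime) ** 2
    return abs(balmer_coefficient() * term)

# Balmer series proper: 1 + l' = 2, with l = 2, 3, 4.
for l in (2, 3, 4):
    print(l, f"{difference_frequency(l, 1):.3e} Hz")
```

With these constants the coefficient is about $3.3 \times 10^{15}$ per second, and the pair $l = 2$, $l' = 1$ gives a frequency near $4.6 \times 10^{14}$ per second, which corresponds to the red hydrogen line.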

5. Concluding Remarks.—Consider a proton and coincident electron 
at rest according to the theory developed above. If these are disturbed, 
the electron responds at first as though it were free, precisely because of 
the complete initial neutralization of the fields due to the charges of the 
proton and electron. This is especially true near the center of the electron. 
But elsewhere the electron responds more slowly, of course, and thus a 
vibratory motion of stationary type is set up within the electron. This is 
characterized by the nodal numbers $l$ and $n$ of the above theory. 

The process of dissipation of the energy of such a stationary wave is 
probably somewhat complicated. 

It is desirable to remark that the precise form of the substance coefficients 
selected above appears to be only an incidental feature. The essential 
requirements appear to be two: (1) complete neutralization in 
equilibrium, i.e., $\sigma_+ + \sigma_- = 0$; (2) a combined substance coefficient $\varphi$ 
in general very small and vanishing at the center of the atom. 


1 The particular specialization made was suggested by the form of the "wave equation" 
used by Schrödinger in his papers “Quantisierung als Eigenwertproblem,” Annalen der 
Physik, vol. 79, 1926. 

2 In general the notation is that of the preceding note unless otherwise stated. The 
subscripts + and − refer to the proton and electron, respectively, in all cases. 

3 Compare Schrödinger, loc. cit. 1. 





Reprinted from Bull. Amer. Math. Soc., March-April, 1927, Vol. 33, 
pp. 165-181. 

A MATHEMATICAL CRITIQUE OF SOME 
PHYSICAL THEORIES* 

BY G. D. BIRKHOFF 

1. Introduction. My purpose today is to review some of 
the mathematical-physical theories of the past and of the 
present, indicating briefly the nature of certain concepts 
upon which these theories rest as well as the attendant logical 
difficulties, and even proposing modifications which have 
occurred to me. Of course the subject is too large for an 
address of this kind, but interest in mathematical-physical 
ideas is very widespread, and their importance for both 
mathematicians and physicists is profound. I feel more 
justified in choosing this subject because it has occupied so 
much of my attention recently. 

2. Geometry. It goes without saying that geometry is the 
first and simplest of such theories, unless arithmetic be 
regarded as of this type. But the only assumption of a 
physical type underlying arithmetic is that material objects 
possess a permanent identity; and this can hardly be con¬ 
sidered as an assumption if we are to do any thinking at all. 

Now ordinary three-dimensional geometry of space is 
extremely simple in essence. Some day, when the field of 
knowledge has extended so far that simplification becomes 
necessary, ordinary geometry may be approached somewhat 
as follows: 

(1) Geometry treats of elements called points and the 
relation called distance between pairs of points. 

(2) The complete tabulation of distances between pairs 
of points may be arranged as follows: 

* Presidential Address, delivered before a joint meeting of this Society 
and Section A of the American Association for the Advancement of Science, 
December 30, 1926. This paper was awarded the Prize of the American 
Association for the Advancement of Science, for the 1926 meeting. 




(a) the points $P$ correspond to real number triples 

$(x, y, z)$; 

(b) the squared distance between $P_1$ and $P_2$ is 

$(x_2 - x_1)^2 + (y_2 - y_1)^2 + (z_2 - z_1)^2$. 

All of geometry follows very readily from these agreements. 
To begin with, the line segment $P_1P_2$ may be defined 
as the set of points $P$ such that the sum of the distance $P_1P$ 
and the distance $PP_2$ is the distance $P_1P_2$. The definition 
of a line segment makes the definition of a complete line 
possible of course, and then of a plane, while the corresponding 
simple algebra yields the analytic equations of lines and 
planes. Similarly, the perpendicularity of two lines $l_1$ and $l_2$ 
with the common point $P$ can be defined as the relation 
existing between $l_1$ and $l_2$ when the sum of the squares of the 
two distances from any point on either line to the common 
point is equal to the square of the distance between these two 
points themselves. In this way one may successively define 
line segment, line, plane, perpendicularity, rectangular 
coordinate systems, etc. The whole body of geometrical fact 
with corresponding analytic framework is thus easily 
deducible, and yet one may stop at the fundamental principles, 
without taking up beautiful but less vital geometrical studies. 
More generally, an entirely analogous direct development of 
general differential geometry of n dimensions, based on the 
well known differential formula for the squared element of 
distance, can be made. Indeed much of specialized geometry 
as well as of analysis is likely to give way some day, for as 
my predecessor, Professor Veblen, said in his address: “The 
main current of mathematics will flow by carrying with it 
only the more important facts.” One thing is plain, however: 
it will remain the first duty of the mathematician to develop 
interesting and beautiful theories of all types, without being 
much concerned with the question of ultimate importance. 
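
The two agreements (a) and (b), together with the definitions of line segment and perpendicularity given above, can be tried out directly. The following Python sketch is only an illustration; the function names and the tolerance `tol` are assumptions introduced here to allow for floating-point rounding.

```python
import math

def sq_dist(p, q):
    """Squared distance between points given as (x, y, z) triples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def dist(p, q):
    return math.sqrt(sq_dist(p, q))

def on_segment(p, p1, p2, tol=1e-9):
    """P lies on segment P1P2 when dist(P1, P) + dist(P, P2) equals dist(P1, P2)."""
    return abs(dist(p1, p) + dist(p, p2) - dist(p1, p2)) <= tol

def perpendicular_at(c, a, b, tol=1e-9):
    """Lines CA and CB are perpendicular at C when CA^2 + CB^2 equals AB^2."""
    return abs(sq_dist(c, a) + sq_dist(c, b) - sq_dist(a, b)) <= tol

origin = (0.0, 0.0, 0.0)
print(on_segment((1.0, 1.0, 1.0), origin, (2.0, 2.0, 2.0)))
print(on_segment((1.0, 0.0, 0.0), origin, (2.0, 2.0, 2.0)))
print(perpendicular_at(origin, (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)))
```

Nothing beyond the table of squared distances is used: both tests are stated purely in terms of distances between pairs of points, as the text proposes.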

The theories of relativity of Einstein have made us 
appreciate that geometry arises directly out of the com¬ 
parison of rigid bodies, and that its meaning becomes more 
and more precise as uniformity of pressure and temperature 




of the bodies, freedom from gravitational, rotational and 
other stresses are secured; but such precision is limited, 
because of the atomic structure of matter. In its origin the 
geometric concept of space is always to be associated with 
that of a corresponding body of reference. 


3. Classical Dynamics. Classical dynamics arises in the 
attempt to use euclidean space and absolute time as the 
means for expressing the laws of nature. Natural laws are 
then interpreted as describing the interaction of various 
particles, rigid and elastic bodies in a selected space of 
reference. 

There lie at the very basis of this attempt to make space 
the container of matter, certain fundamental difficulties, to 
some of which I wish to direct your attention. The simplest 
illustration of them arises in dealing with a collection of 
"equal rigid elastic spheres,” which move in straight lines 



with uniform velocity except in so far as they may collide. 
The spheres will be taken to be non-rotating and smooth. 
This is the model on which the statistical treatment of the 
properties of a perfect gas is often based. When only two 
spheres collide, the assumed laws of contact action determine 
uniquely their directions and velocities under collision; but 




when more than two spheres collide, the situation is entirely 
different. 

Suppose that three equal spheres approach a point with 
equal velocities, the lines of motion being 120° apart and in 
the same plane. If all three spheres collide at the same 
instant, considerations of symmetry alone demand that the 
spheres must rebound back along the same lines with a 
velocity equal to that of approach. But it is easily verified 
that if two of them collide ever so little before they collide 
with the third, the resultant motion will be decidedly differ¬ 
ent in character, as indicated by the dotted lines of the 
figure. Such a result seems to contradict the fundamental 
physical requirement of continuity. In fact the laws of action 
suffice to determine the behavior of two spheres which collide, 
but as more and more complicated simultaneous collisions 
between three or more spheres are considered, it is not 
possible to infer their behavior by an argument based on 
continuity or even on symmetry. Hence these laws need to 
be supplemented by indefinitely many others of arbitrary 
type, if the mathematical theory is to be determinate. 
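
The discontinuity described above can be made concrete with a small computation. The Python sketch below is an illustration under stated assumptions, not a reconstruction of the figure: the spheres have unit mass and move in a plane, the delay of the third sphere is taken in the limit of vanishing size, and its subsequent contact with the other two is resolved by equal impulses (symmetry) together with conservation of kinetic energy. All names are hypothetical.

```python
import math

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

def add_scaled(v, s, n):
    """Return v + s * n for plane vectors."""
    return (v[0] + s * n[0], v[1] + s * n[1])

def approach_velocities(v=1.0):
    """Three equal spheres moving toward the origin along lines 120 degrees apart."""
    angles = (90.0, 210.0, 330.0)
    return [(-v * math.cos(math.radians(t)), -v * math.sin(math.radians(t)))
            for t in angles]

def simultaneous(v=1.0):
    """All three touch at once: by symmetry each velocity is simply reversed."""
    return [(-vx, -vy) for vx, vy in approach_velocities(v)]

def staggered(v=1.0):
    """Spheres 2 and 3 touch first; sphere 1 arrives an instant later."""
    v1, v2, v3 = approach_velocities(v)
    # Collision 2-3: the line of centers is the x axis, so equal smooth
    # spheres exchange their x components of velocity.
    n = (1.0, 0.0)
    a, b = dot(v2, n), dot(v3, n)
    v2, v3 = add_scaled(v2, b - a, n), add_scaled(v3, a - b, n)
    # Sphere 1 then meets 2 and 3 together (limit of vanishing delay).
    # Unit normals pointing from spheres 2 and 3 toward sphere 1:
    n2 = (0.5, math.sqrt(3.0) / 2.0)
    n3 = (-0.5, math.sqrt(3.0) / 2.0)
    s = (n2[0] + n3[0], n2[1] + n3[1])
    # Equal impulses J (assumed by symmetry), fixed by energy conservation.
    J = 2.0 * (dot(v2, n2) + dot(v3, n3) - dot(v1, s)) / (dot(s, s) + 2.0)
    v1 = add_scaled(v1, J, s)
    v2 = add_scaled(v2, -J, n2)
    v3 = add_scaled(v3, -J, n3)
    return [v1, v2, v3]

for name, result in (("simultaneous", simultaneous()), ("staggered", staggered())):
    print(name, [round(math.hypot(*w), 4) for w in result])
```

In the simultaneous case every sphere leaves with its speed of approach, whereas in the staggered case the late sphere leaves slowly and the other two leave faster, however small the delay. Momentum and kinetic energy are conserved in both cases, so the two outcomes differ only through the order of contact.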




The situation is similar with n mass particles, attracting 
one another according to the Newtonian law. The laws of 
motion are then incorporated in 3n differential equations 
of the second order. In the case of only three particles, if 
certain three area integral constants are not all zero, 
Sundman* has shown that triple collision is impossible, and that 
the differential equations themselves suffice to define the 
motion after double collision; in fact, at double collision the 
bodies approach each other in a certain direction and may 


* Acta Mathematica, 1912. 




be taken to recede with nearly the same velocity in the 
opposite direction after collision; this is proper because for 
nearly a double collision the behavior is almost the same, 
as indicated by the dotted lines in the figure below. Hence 
considerations of continuity determine the behavior at 
collision. 
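
The $3n$ differential equations of the second order mentioned above have right-hand sides that are easy to write down. The Python sketch below is a minimal illustration; the function name, the argument layout, and the default gravitational constant `G = 1.0` are assumptions made here. It also shows why collisions need separate treatment: the expressions become singular when two particles coincide.

```python
def accelerations(masses, positions, G=1.0):
    """Right-hand sides of the 3n second-order equations x_i'' = a_i.

    masses: list of n floats; positions: list of n (x, y, z) tuples.
    Raises ZeroDivisionError at a collision, where the equations are singular.
    """
    n = len(masses)
    acc = [[0.0, 0.0, 0.0] for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            d = [positions[j][k] - positions[i][k] for k in range(3)]
            r2 = sum(c * c for c in d)
            r3 = r2 ** 1.5
            for k in range(3):
                # Newtonian attraction of particle j on particle i.
                acc[i][k] += G * masses[j] * d[k] / r3
    return acc

print(accelerations([1.0, 2.0], [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]))
```

The equations themselves say nothing about what happens at the singular instant; any continuation past a collision has to be supplied separately, which is the point made in the text.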

But, if there are further particles, three particles may 
happen to collide simultaneously, and then there arises the 
same kind of indeterminateness as in the case when three 
rigid elastic spheres collide. Hence, here too, other sup¬ 
plementary conditions are required in addition to those 
furnished by the differential equations. 

Finally, the concept of elastic bodies so fundamental in 
classical dynamics presents even more formidable logical 
objections. The whole of that theory is based on the notion 
of continuously distributed matter subject to certain strains 
and stresses obeying Hooke’s law. It is a characteristic 
feature of the isotropic elastic body that the effects of any 
disturbance are propagated at a definite velocity, namely, 
the disturbance velocity. Bearing this property in mind, let 
us ask what happens when two equal elastic spheres under 
no pressure approach along their line of centers with equal 
velocities which exceed the disturbance velocity. The parts 
of the spheres approaching the plane of collision have no 
possibility of reacting to the disturbance of collision since 
that plane is approaching the center of either sphere at a 
velocity greater than the disturbance velocity. On the other 
hand, the parts of the spheres which collide at the plane 
cannot rebound without interpenetration. Thus it appears 
as if the spheres are converted into a kind of lamina of 
infinite density moving radially outward in the plane of 
symmetry. But this yields a total change of state, which the 
theory of elasticity does not contemplate. 

Another but lesser stumbling block is contained in the 
following situation: Consider two equal elastic hemispheres 
at rest with the two circular boundaries in contact and compare 
these with a single sphere of the same size and also at 
rest. The differential equations, initial densities and pres¬ 
sures are identical and yet the reactions under the same force 
may be different, for in the first case the two hemispheres 
can be separated. 

These illustrations show that the classical theory of 
particles, rigid and elastic bodies, needs to be supplemented 
by a variety of conditions, some of which do not appear to 
have been formulated, and that such formulation will of 
necessity be artificial in character. 

There are other facts to be considered also. Hertz pointed 
out the peculiarities of such cases as an elastic sphere resting 
on a plane surface. Furthermore, if elastic spheres collide 
successively with one another, the internal kinetic energy 
will increase, while the relative velocities of the spheres 
become smaller and smaller. Thus a collection of small 
elastic spheres does not really furnish a model for a perfect 
gas. But these are questions of complication rather than of 
principle. 

The central difficulty of indeterminateness disappears in 
the case of a single elastic body, but then the situation is no 
longer analogous to the case of a set of particles in empty 
space, which formed our starting point. The question now 
confronts us: Is it possible to conceive of simple laws of 
motion for systems of particles and for continuous bodies in 
empty space which will be unified and determinate? 

To secure such a system of particles, it suffices to subject 
the particles to forces acting directly between them and of 
the form 



where m and M are the masses of the two particles, and r is 
the distance between them; in other words, in addition to 
the ordinary Newtonian force of attraction there is a re¬ 
pulsive force inversely proportional to the cube of the 
distance. Since the potential energy of the system then 




increases indefinitely when any two particles approach 
collision, it follows that collision can never take place. 
Furthermore, if there are dissipative forces, there will be 
obviously a position of stable equilibrium for such a set of 
particles. 
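
The reasoning of the preceding paragraph can be written out in a minimal sketch. The positive constants $k$ and $c$ below are assumptions introduced only for illustration; the text fixes no more than the dependence of the force on $m$, $M$ and $r$.

```latex
% Assumed form: Newtonian attraction plus an inverse-cube repulsion,
% with illustrative positive constants k and c (not given in the text).
\[
  F(r) = -\frac{k\,mM}{r^{2}} + \frac{c\,mM}{r^{3}},
  \qquad
  V(r) = -\int F\,dr = mM\left(-\frac{k}{r} + \frac{c}{2r^{2}}\right).
\]
% The repulsive term dominates at short range, and each pair term is
% bounded below by its value at the equilibrium separation r_0:
\[
  V(r) \to +\infty \ \text{ as } r \to 0^{+},
  \qquad
  V(r) \ge V(r_{0}) = -\frac{k^{2}\,mM}{2c},
  \qquad
  r_{0} = \frac{c}{k}.
\]
% With total energy T + V = E and T >= 0, no pair distance can tend to zero.
```

Under these assumptions the total potential energy is a sum of pair terms, each bounded below, so it grows without limit as soon as any one pair approaches collision; conservation of energy then excludes the collision.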

In order to deal with a continuous distribution of matter, 
we need merely to set up an appropriate potential energy 
function 

where dm is the element of mass and the integration is taken 
over space. This means that the law of force for the con¬ 
tinuous distribution is the same as that for the system of 
particles just considered; the acceleration in any direction 
of a point of the body is given by the corresponding gradient 
of the potential function U. It is obvious that such a body 
cannot contract indefinitely since then its potential energy 
would exceed the total initial energy; nor can it expand 
indefinitely unless sufficient kinetic energy is available. 
Moreover, there will clearly be a single spherical state of 
equilibrium in case dissipative forces are present. 

This type of body is instructive in another way. It 
illustrates how necessary it is to take the differential equa¬ 
tions of motion as fundamental rather than to rely upon 
what appear to be simple physical intuitions. For, consider 
what happens if two such bodies collide. If we let ordinary 
intuition speak, we might declare that they will either re¬ 
bound from one another or continue indefinitely in contact. 
But the nature of the differential equations of motion ex¬ 
cludes both of these possibilities. In fact, when the bodies 
first touch, the points of contact have differing velocities, 
whereas the accelerations will be the same according to the 
formula. Consequently, the formula shows that they must 
interpenetrate rather than remain in contact or rebound 
from one another. Evidently such a fluid is entirely different 
in character from the elastic body under pressure, but it has
at least the theoretical advantage of being free from the 
fundamental defect of indeterminateness. As perhaps of 
some interest, it may be remarked that two colliding bodies 
of this description will in general separate after a transitional 
period of interpenetration. 

The chief mathematical instruments used by the physicists 
in dealing with space, time and matter according to the 
conception of classical physics have been the Lagrangian and 
Hamiltonian equations. I wish to digress somewhat from 
my main theme in order to make a remark concerning the 
nature and significance of these equations. 

A dynamical system representable approximately by 
means of a set of “mass particles” subject to “rigid constraints”
and “conservative forces” of gravitational or elastic character,
has its potential energy given by a function of n spatial 
coordinates, while its kinetic energy is expressed in terms of 
these coordinates and their velocities in the usual way. It 
is readily demonstrated that the equations of motion may be 
written in either of the above mentioned forms. The mathe¬ 
matician has thus been led to ask: What are the character¬ 
istics of equations of these types? In the particular case of 
the solar system, it was demonstrated by Laplace and 
Poisson that there would be partial stability. Poincaré
proved that it was a general characteristic of equations of 
this type to possess complete formal stability. This means 
that the disturbances from periodic motion are essentially 
periodic in type and representable to all orders by means of 
trigonometric series with a certain degree of approximation. 

I have recently succeeded in establishing a kind of converse 
result: If a dynamical system is such that its state is determined
by $2n$ coordinates, and if the perturbations from a
periodic motion can be thus represented by trigonometric 
series, then the equations may be given Hamiltonian form.* 
Thus perhaps the only significance of the Hamiltonian form 
of equations in classical dynamics is to insure automatically 
that periodic motions are completely stable. A more general
Pfaffian type of equation seems better adapted to be useful
to the physicist, for it possesses greater ease and flexibility
of transformation of the variables. I believe that the whole
apparatus of transformation which is used in connection with
the Hamiltonian equations has been overrated.

* Comptes Rendus, September 20, 1926.

4. The Electromagnetic Theory. The equations of Maxwell, 
giving the interplay between the electric and magnetic forces 
in space, have never been modified, although it has been 
shown that these admit of very elegant formulation when pre¬ 
sented in the Maxwell-Lorentz form. The space-time back¬ 
ground corresponding to this form is that of the special theory 
of relativity, in which space and time are taken relative 
to some reference body and in which the velocity of light 
appears as a characteristic limiting disturbance velocity. 

Fundamental in this theory is the notion of the minute 
test charge of electricity in the field, which is always supposed 
to be associated with ordinary matter. It is the acceleration 
of this electrified mass particle which measures the electric 
and magnetic forces in the field. In recent years statements 
have been made by physicists conjecturing that all mass is 
electromagnetic in origin. Now I have never been able to 
understand such a point of view, for in that case the ultimate 
particle of electricity would be “free” and would constantly
move in directions determined by the electric and magnetic 
forces with a velocity equal to that of light. 

A first question which arises is this: Shall the forces arising 
from the charge of the particle itself be considered as forming 
part of the total force? If we admit that this can be possible, 
we are confronted with the difficulty that the forces acting 
on the point particle will be infinite; but if we consider the 
resultant electromagnetic forces acting on a hypothetical 
small electrified sphere with the electrified particle at its 
center, the limiting forces can be determined. However, for 
a system of such particles there will be an indefinite radiation 
outward of electromagnetic energy as oppositely electrified 
particles fall into one another, while those similarly electrified
tend to separate indefinitely under the mutual forces of 
repulsion. Evidently such a system of electrified particles 
is of little physical interest. 

The use of an elastic fluid as the carrier of electricity seems 
at first sight to offer greater prospects of success. Not long 
ago I attempted to develop a theory based upon an adiabatic 
fluid having the property that the tension increased as the
square of the density of the electrical charge. This tension 
was introduced in order to offset the electrical forces of 
repulsion within the proton and electron. It turned out that 
there was a unique sphere of equilibrium, of definite radius 
but of unstable type. This sphere contracted under slight 
displacement, and I tried to develop some of the properties 
of what I thought might be the final stable form.* 

Further examination of the problem has made it appear 
impossible to secure stability of the kind desired, no matter 
what type of relation between pressure and density is 
assumed to hold. For, if we imagine the bits of the fluid to 
be widely distributed but under the same pressures and 
densities as at the outset, it is clear that the elastic potential 
energy is the same as before, while the electromagnetic 
energy of the field has been diminished. From this fact and 
some more purely mathematical considerations, it follows 
that there will always be a tendency toward breaking up of 
the fluid into smaller and smaller nuclei so that no final stable 
condition is possible. 

But, aside from this instability which arises from the fact 
that electricity of one sign exerts strong forces of repulsion 
upon itself, there is another difficulty which arises even in 
the consideration of neutral matter not carrying any electri¬ 
cal charge. In fact, we have two types of disturbance 
velocities: firstly, that of light, and secondly, that of the 
elastic wave of the fluid. Now if the velocity of the latter 
wave is less than that of light, we can imagine two bodies 
approaching each other from opposite directions with velocities
greater than the disturbance velocity but less than
that of light. In this case the same paradox would arise as
in the analogous classical case considered above.

* The Origin, Nature and Influence of Relativity, New York, The Macmillan
Company, 1925. Chapters 6 and 7.

Therefore the only possible elastic fluid would seem to 
be one with a disturbance velocity equal to that of light at all 
densities. The fluid of this type I shall term the “perfect 
fluid,” by analogy with the ordinary perfect gas. The pressure 
in such a fluid is readily determined to be one half the density 
in absolute units, and so enormously great. Evidently the 
perfect fluid would tend to expand indefinitely with velocities 
approaching that of light. Of course such a perfect fluid 
would afford a much more unstable carrier of electricity than 
the one which I had proposed (loc. cit.). It should be noted 
in passing that because of the relativistic character of this 
perfect fluid, the mass of a small part of it is not invariable 
but changes as the square of the density of the attached
charge. 

Consequently, attempts to make use of an elastic fluid as 
the carrier of electricity seem to fail, and we are led to inquire 
whether it is not in the nature of the case that the elementary 
bodies such as the protons and electrons must be allowed to 
have some sort of autonomous existence in the same sense 
that empty space has. I shall make a suggestion of this sort 
in the “atomic potentials” used below. For the moment,
however, I wish to emphasize the difficulty in maintaining 
the point of view of the “field” consistently. Consider a 
stretched elastic string which is vibrating longitudinally. 
No matter how the linear density or elastic coefficient varies, 
any local disturbance is propagated in both directions along 
the string with the disturbance velocity. Consequently the 
particle cannot be treated as a moving local disturbance. 
However, we might compare the particle to a bead on a 
string and moving with a velocity of its own, whose motion 
is affected by and affects that of the string. The bead would
not appear then as a part of the string but would have 
autonomous existence. Here the strings correspond to the
“field” and the bead to the body.

The idea of atomic potentials may be explained as follows. 

It is well known that the kinetic and elastic energy of an 
adiabatic fluid at low velocities can be defined just as in 
classical dynamics, in such wise that the principle of con¬ 
servation of energy holds. Let us suppose in addition that 
there is an individual atomic potential energy of positive
volume density $\psi$, where $\psi$ has a value fixed for all time
for each point of the fluid. This leads to a supplementary
body force, proportional to the gradient of $\psi$ in space. In
this way indefinite expansion of the corresponding proton or 
electron is prevented, for it would involve an indefinite 
increase in the atomic potential energy. 

At first sight this seems to insure a stable spherical form 
of equilibrium. However, further consideration of small 
motions near the equilibrium state shows that the radial 
forces are so small as to make the nucleus amorphous under 
radial displacement, with a tendency to spread over space 
in wisp-like form.

But now suppose that the protons are made of very
small parts of the fluid with charge $+e$, while the electrons
are made of parts of the perfect fluid with the charge $-e$
and with suitable atomic potentials. Let us suppose further¬ 
more that such an electron can be penetrated freely by the 
proton. 

Under these circumstances there will be a stable spher¬ 
ical form of equilibrium in which the proton coincides 
with the electron; for the tendency towards amorphous 
shape of the electrons and protons will be destroyed by the
attractive forces between them.

The fact ought not to pass unnoted that all of the forces 
due to the atomic potential will have a resultant zero if the 
density of potential $\psi$ is constant at the boundary. Hence the
electron as well as the proton will respond to external forces 
in the required way. 

Here perhaps is a kind of two substance theory of matter 
and electricity which will be found to meet the fundamental 
mathematical requirements of determinateness and stability.

5. General Relativity. The essential logical structure of 
the gravitational theory of relativity of Einstein is extremely 
simple. Let us suppose that the universe of events is four¬ 
dimensional. Imagine a very slight disturbance of the exist¬ 
ing physical condition to be made at an event. This dis¬ 
turbance will spread and modify physical phenomena in a 
certain region of space-time, in accordance with the principle 
of local causation. This region is analogous to a three- 
dimensional cone with vertex at the given event. Now the 
simplest mathematical form of definition for such a cone is 
that obtained from the vanishing of a differential quadratic 
form $ds^2$. Hence it seems probable that such a form $ds^2$ will
lie at the base of any physical theory; in consequence, the 
symbolic language of tensor analysis becomes available. If 
we assume also that a mass particle moves in the space-time 
of general relativity according to the same underlying law 
as in the special theory of relativity (principle of equivalence) 
and if we determine the coefficients of $ds^2$ in such wise as
to be as nearly as possible the same in character as the
coefficients in the special theory, they become to a large
degree determined.

Now it seems obvious that the space-time in the vicinity 
of the sun possesses spherical symmetry, and the law of 
motion obtained on this assumption is essentially that of 
Newton. In this way the fundamental experimental tests of
Einstein are obtained.

Thus the general theory of relativity appears as a theory 
of empty space and throws no light upon the structure of 
matter. It is only the spherical symmetry of the space-time 
about the sun which allows us to come to a conclusion. 

The logical situation is analogous to that involved in the 
following example. Suppose that a uniform square plate is 
maintained with two opposite sides at temperature 0°C, and 
the other two sides at temperature 100°C. I say that it is
possible to prove that the temperature along the diagonals
is exactly 50°C, without making any other assumption than
that states of temperature are additive. In fact, imagine
the square rotated about one diagonal. In the new position
the sides at temperature 0° take the position of the sides 
previously at temperature 100°, and vice versa. By addition 
of the first and second states, a state is obtained at tempera¬ 
ture 100°C around the four sides, and therefore 100°C within. 
But evidently along the invariant diagonal the temperatures 
of the two states are identical. In consequence the tempera¬ 
ture along this diagonal, and similarly along the other, must 
be 50°C. A like treatment of a rectangular plate would not 
be possible, because of the lesser symmetry. 
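
The symmetry argument above can be checked numerically in a small sketch. It assumes more than the text does: the steady temperature is taken to satisfy the discrete Laplace equation (each interior value is the mean of its four neighbours), and the grid size, sweep count and function names are illustrative choices, not part of the original argument.

```python
def solve_plate(n=21, sweeps=2000):
    """Jacobi relaxation for a steady temperature on an n-by-n grid.

    Assumed layout (illustrative): rows 0 and n-1 are held at 100,
    columns 0 and n-1 at 0. Corner values never enter the 5-point stencil.
    """
    u = [[50.0] * n for _ in range(n)]
    for k in range(n):
        u[0][k] = u[n - 1][k] = 100.0   # two opposite sides at 100
    for k in range(n):
        u[k][0] = u[k][n - 1] = 0.0     # the other two sides at 0
    for _ in range(sweeps):
        new = [row[:] for row in u]
        for i in range(1, n - 1):
            for j in range(1, n - 1):
                # Each interior value becomes the mean of its four neighbours.
                new[i][j] = 0.25 * (u[i - 1][j] + u[i + 1][j]
                                    + u[i][j - 1] + u[i][j + 1])
        u = new
    return u


def diagonal_values(u):
    """Interior temperatures along both diagonals of the grid."""
    n = len(u)
    main = [u[i][i] for i in range(1, n - 1)]
    anti = [u[i][n - 1 - i] for i in range(1, n - 1)]
    return main, anti
```

Because the relaxation step is linear and unchanged by reflection in either diagonal, the interior diagonal values stay at 50 up to rounding, while points near a hot side lie above 50 and points near a cold side below it.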

In attempting to extend the general theory of relativity 
to space occupied by matter, Einstein was led to “field 
equations” 

$$R_{ij} - \tfrac{1}{2} R\, g_{ij} = \lambda T_{ij} \qquad (\lambda = -8\pi),$$

where $T_{ij}$ is the “energy tensor” of matter and electricity.
These equations are supposed to be supplemented by various 
“constitutive equations” such as the Maxwell-Lorentz equa¬ 
tions. But it should be noted that there are in reality two 
types of energy tensors, namely, one for empty space and 
another for the interior of matter; for instance, it is only 
within matter that the velocity tensor is defined. Thus these 
are not really field equations as they stand. 

The space-time framework of general relativity is adapted 
to our concept of atomic potential; for this purpose we define 
$T_{ij}$ as consisting of the usual elastic and electromagnetic
energy tensor and of an atomic potential term $\psi g_{ij}$. The mathematical
fact that the divergence of the energy tensor
vanishes will yield the observed law of motion of the protons 
and electrons. 

6. Atomic Theory. It is evident today that all earlier ideas 
in physics have been statistical in character, and that the 
fundamental laws are those of the atomic domain. The 
kinetic theory of heat illustrates this method of explanation 
of physical fact in the simplest possible way. Naturally the 
attention of physicists everywhere is fixed upon the discovery
of the laws of the atom. The main theoretical attack is
based upon electrical phenomena in rarefied gases and the
properties of spectra.

If we grant the four-dimensional nature of space-time, it is
natural to argue by analogy about the atomic
domain. This argument of continuity seems to make it
imperative that the atom is an oscillating electromagnetic 
system, which radiates or absorbs electromagnetic energy. 

Now the central facts about the atomic oscillator are 
essentially two: first, it acts like a combination of simple 
resonators of perfectly definite frequencies, such as those 
given by the Balmer formula in the case of the hydrogen 
atom; and secondly, these frequencies are excited only by 
means of certain quanta of energy (Planck’s hypothesis). 

Now in my opinion there need be no essential difficulty 
in accounting for this second fact. Imagine a pendulum to 
swing in a viscous medium whose viscosity diminishes rapidly 
as the distance from the position of equilibrium increases. 
With less than a certain initial velocity, the pendulum would 
swing back slowly towards its position of equilibrium, never 
passing it. On the other hand, if the initial velocity is 
sufficient to carry the pendulum beyond the zone of consider¬ 
able viscosity, it will oscillate back and forth, traversing the 
viscous region in damped harmonic motion. Consequently 
we may conceive of the so-called “energy levels” as defining
the amount of energy necessary to carry the oscillators so 
far from equilibrium that they will move back and forth past 
the position of equilibrium. Thus the first and most funda¬ 
mental task appears to be to find an oscillator possessing 
the desired frequencies. Afterwards one may investigate in 
detail the rate of electromagnetic radiation, which will
correspond to viscosity.
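
The pendulum picture above can be illustrated with a rough numerical sketch. Everything specific in it is an assumption made for illustration: the pendulum is replaced by a linear oscillator $x'' = -x - c(x)\,x'$, the viscosity profile $c(x)$ is a Gaussian concentrated near equilibrium, and the constants and function names are hypothetical.

```python
import math


def viscosity(x, c0=20.0, width=0.1):
    """Assumed damping profile: strong near x = 0, negligible a few widths away."""
    return c0 * math.exp(-(x / width) ** 2)


def count_crossings(v0, t_end=30.0, dt=5e-4):
    """Integrate x'' = -x - viscosity(x) * x' from x = 0 with initial velocity v0.

    Returns the number of times x changes sign, i.e. how often the
    oscillator passes through its position of equilibrium.
    """
    x, v = 0.0, v0
    crossings = 0
    for _ in range(int(t_end / dt)):
        # Semi-implicit Euler step: update the velocity first, then the position.
        v += dt * (-x - viscosity(x) * v)
        x_new = x + dt * v
        if x * x_new < 0.0:
            crossings += 1
        x = x_new
    return crossings
```

With these illustrative constants a small initial velocity leaves the oscillator inside the viscous zone, where it creeps back without passing equilibrium, whereas a large one carries it out of the zone so that it swings back through equilibrium. The threshold between the two regimes plays the part that the text assigns to an energy level.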

It seems premature to abandon the attempt to explain 
the facts upon some such basis. Computations of the fre¬ 
quencies should be made for some simple conjectural theories 
of matter and electricity which meet the elementary mathematical
demands of determinateness and stability. As far as
I know this has not been done for a single case. The tendency 
of the quantum theory as developed by N. Bohr, Heisenberg, 
Born and others, has been altogether in the direction of 
obtaining a discrete theory of atomic structure. No doubt 
the discovery of a consistent theory of this kind would be 
of the utmost interest for mathematics as well as physics. 
But from my limited mathematical point of view, I can 
discern no kernel of thought in their work which tends toward 
the construction of a logically satisfactory discrete model, 
although this work is obviously of the highest value for 
physics. 

Very recently Schrödinger* has obtained a highly suggestive
“wave equation,” so that spectroscopic frequencies
are treated by him as somewhat analogous to the frequencies 
of vibration of an electrodynamic system. He determines 
a sequence of frequencies and conjectures that the Balmer 
formula, which is obtained by taking the difference of two 
such frequencies, may be a difference frequency in the usual 
physical sense. 
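
The statement that the Balmer formula arises as a difference of two frequencies can be made concrete in a short sketch. The term frequencies $Rc/n^2$ and the numerical constants are standard values supplied here for illustration; the function names are hypothetical and nothing below is taken from Schrödinger's paper.

```python
RYDBERG_H = 1.0967758e7      # Rydberg constant for hydrogen, in 1/m (approximate)
LIGHT_SPEED = 2.99792458e8   # speed of light, in m/s


def term_frequency(n):
    """Frequency R*c/n^2 attached to the n-th term of hydrogen, in Hz."""
    return RYDBERG_H * LIGHT_SPEED / n ** 2


def balmer_frequency(n):
    """Balmer line as the difference of the terms n = 2 and n > 2, in Hz."""
    return term_frequency(2) - term_frequency(n)


def balmer_wavelength_nm(n):
    """Vacuum wavelength of the Balmer line for upper level n, in nanometres."""
    return LIGHT_SPEED / balmer_frequency(n) * 1e9
```

For example, `balmer_wavelength_nm(3)` and `balmer_wavelength_nm(4)` give the red and blue-green hydrogen lines near 656 nm and 486 nm.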

It will be of interest to compare his wave equation with 
that for the small oscillations of the fluid proton and electron 
in the theory outlined above. Now we observe that in this 
theory there figures a “substance coefficient” $\phi$ whose square
gives the ratio of the density of matter to the square of the
electrical density at any point; there is also the “atomic
potential” $\psi$. Thus there are two functions $\phi$ and $\psi$ which may
be given arbitrarily at the outset, and which need to be 
specified before definite conclusions can be made. I have 
not as yet had the opportunity of determining the boundary 
conditions and the actual frequencies.† The three wave
equations are of the same general type as the Schrodinger 
equation. 


* Annalen der Physik, 1926.

† Further development of the theory outlined above will appear
shortly in the Proceedings of the National Academy. The theory leads 
to a formula of the Balmer type for the frequencies.

There is another remark about atomic dynamics that 
seems to me of importance. It is usual to approximate to 
an atomic problem by an ordinary dynamical problem such 
as the two-body problem, and the equations are then of 
Hamiltonian form. It is well known that any state of such a 
system tends to recur. Thus in general the motion will 
not be limited to certain periodic motions as demanded, or 
at least suggested, by the quantum theory. But, for a non- 
Hamiltonian system I have shown that there will exist a 
restricted set of “central motions” near which any motion
will in general be found.* Consequently if the central 
motions are periodic, we may expect a corresponding set 
of sharp frequencies. Thus an approximating set of differ¬ 
ential equations may yield quantum orbits without any 
quantum conditions, provided that the equations are not of 
Hamiltonian type. 

If I have convinced you of the necessity for a careful 
critique of the logical questions inherent in the laws ordinar¬ 
ily given for matter and electricity, I shall feel very well 
satisfied. The mathematician can be of great service in 
analyzing these basic ideas, in forming physical models of 
as simple type as possible which meet the most pressing 
physical requirements, and in making the necessary cal¬ 
culations of oscillation frequencies, etc. The possibilities of 
such models based upon an underlying four-dimensional 
space-time continuum have by no means been exhausted. 

Harvard University. 


* Göttinger Nachrichten, 1926.


Reprinted from Sir Isaac Newton, 1727-1927. A Bicentenary Evaluation
of His Work. Baltimore, 1928, pp. 51-64.


NEWTON’S PHILOSOPHY OF GRAVITATION 
WITH SPECIAL REFERENCE TO MODERN 
RELATIVITY IDEAS 

GEORGE DAVID BIRKHOFF, Ph.D., D.Sc.

Professor of Mathematics, Harvard University 

It is said that at the conclusion of a dinner given
at the home of the English painter Haydon 
nearly a century ago, the poet Keats, raising his 
glass, proposed a toast to the confusion of Newton, 
and that Wordsworth, astonished, asked an ex¬ 
planation. Keats replied that Newton had de¬ 
stroyed the rainbow in reducing it to a prism. 
Today we honor Sir Isaac Newton for the very 
achievements which the poet deplored. 

Following immediately upon the optical dis¬ 
coveries alluded to by Keats, came the great 
achievements of Newton in the theory of gravita¬ 
tion. In order to understand the advance which 
he made, it is necessary to recall briefly what had 
been done before his time, and at what stage his 
contemporaries had arrived, when he announced 
the law of gravitation. 

Up to the time of the ancient Greeks, scientifi¬ 
cally-minded men had accumulated comparatively 
few experimental facts. These lay mainly in the 

[5i] 


764 


newton’s philosophy of gravitation 


fields of elementary optics, mechanics and astron¬ 
omy. Thus Euclid wrote two books on the optical 
properties of light and its reflection from mirrors; 
Archimedes stated correctly the mechanical prin¬ 
ciples of the lever and the equilibrium of floating 
bodies; and Ptolemy wrote an elaborate astronomi¬ 
cal treatise. It was astronomy, however, that was 
pursued with most success. 

For the description of all these facts as well as 
those of every day life, the concepts of space and 
time, which seemed self-evident, were available. 
This concept of space is incorporated in ordinary 
geometry, while the concept of measurable absolute 
time was so immediate that it went unquestioned 
by physicists and philosophers until the discovery 
of the theory of relativity by Einstein. Using 
this apparently inevitable structure of space and 
time, it was possible to formulate the observed 
laws with exactitude. 

As soon as a sufficient number of astronomical 
facts had been obtained, it was perceived that the 
sun, moon and planets were large spherical bodies 
like the earth, and that their motions, and the 
motions of the stars, despite a superficial simplicity 
to the casual observer, were bewilderingly irregular 
in detail. Here was offered a fascinating mystery 
of the heavens which some of the greatest minds 
were bound to attempt to unravel.

The idea that the sun might be the central body 
instead of the earth is one of great antiquity. 
Nevertheless scientific thought before Copernicus 
took the earth to be absolutely at rest. The task 
of explaining the motions of the heavenly bodies 
relatively to the earth proved to be exceedingly 
difficult, and gave rise to the concept of motion 
in epicycles. This sufficed for the mere description 
of many of the facts, but in no way unified or 
explained them. 

Early in the sixteenth century Copernicus pub¬ 
lished the theory known by his name, according to 
which the space attached to the sun and fixed 
stars, rather than the space attached to the earth, 
is “at rest” or “absolute.” He effected thereby 
a remarkable simplification in the explanation of 
the observed facts. On the basis of this theory 
Kepler was not only able to discover the laws of 
motion of the planets about the sun, but also came 
to have vague but essentially correct ideas con¬ 
cerning a gravitational force of attraction which 
kept the earth, moon and planets in their orbits. 

Furthermore his great contemporary Galileo, 
who all but invented the telescope, and made 
remarkable discoveries with it, established the fact 
that it was not change of place so much as change 
of velocity which measured force; thus, a body 
not acted upon by forces would move with constant
velocity in a straight line. The essential elements 
of modern dynamical law must be attributed to 
Galileo. 

Kepler went so far as to conjecture that the law 
of attraction according to the inverse second power 
of the distance might hold. It was primarily the 
lack of suitable mathematical instruments of 
thought which stood in the way of further de¬ 
velopment. Without the analytic geometry of 
Descartes and the infinitesimal calculus of Newton 
and Leibnitz, the solution of the problem was 
impossible. 

Thus it is not surprising that Halley, Sir Christopher
Wren and Hooke in England were considering
the possibility that this law might hold, almost
simultaneously with Newton. Halley wrote to
Newton on June 29, 1686, concerning this matter as
follows:

“And I know to be true that in January (16)83/4
I, having from the consideration of the sesquialter
proportion of Kepler, concluded that the centripetall
force decreased in the proportion of the squares
of the distances reciprocally, came one Wednesday
to town, where I met with Sr. Christ. Wrenn and
Mr. Hook, and, falling in discourse about it, Mr.
Hook affirmed that upon that principle all the laws
of celestiall motions were to be demonstrated, and
that he himself had done it. I declared the ill
success of my attempts; and Sr. Christopher, to
encourage the inquiry, sd that he would give Mr.
Hook or me two months time to bring him a convincing
demonstration thereof, and besides the
honour, he of us that did it, should have from him
a present of a book of 40s.”
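
The step Halley describes, from Kepler's sesquialter proportion to the inverse-square law, can be sketched for the special case of a circular orbit. The restriction to circular orbits and the symbol $K$ for the constant in Kepler's third law are assumptions made for illustration; this is not a reconstruction of Halley's own working.

```latex
% Uniform circular motion of radius r and period T (assumed special case):
\[
  a = \frac{v^{2}}{r} = \frac{4\pi^{2} r}{T^{2}},
  \qquad v = \frac{2\pi r}{T}.
\]
% Kepler's sesquialter proportion, T proportional to r^{3/2}:
\[
  T^{2} = K r^{3}
  \quad\Longrightarrow\quad
  a = \frac{4\pi^{2}}{K}\,\frac{1}{r^{2}}.
\]
```

The centripetal acceleration therefore falls off as the square of the distance reciprocally. The harder problem, which the wager concerned, is to derive all of Kepler's laws, elliptical orbits included, from that law of force.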

However, Newton far surpassed his contem¬ 
poraries in mathematical power as well as in physi¬ 
cal insight. He was the first to triumph over the 
purely mathematical difficulties involved. In my 
opinion it is this which constitutes his greatest 
achievement rather than the first formulation of 
the law of gravitation, or of the laws of motion, 
both known by his name today. 

It is worthy of note that Newton himself held 
this point of view. His able but jealous rival 
Hooke felt that the discovery was his own, merely 
because he had announced the law. But Newton 
considered that there was very little merit in 
Hooke’s unverified conjecture. In the letter to 
his friend Halley, written June 20, 1686, to which
Halley's letter replied, he said sarcastically in 
regard to Hooke’s attitude: 

“Now is not this very fine? Mathematicians 
that find out, settle, and do all the business, must 
content themselves with being nothing but dry 
calculators and drudges; and another that does 
nothing but pretend and grasp at all things must 

[55] 


768 



newton's philosophy of gravitation 

carry away all the invention, as well as of those 
that were to follow him, as of those that went 
before.” 

In his great Principia of 1687 Newton took all of
these questions out of the realm of nebulous specu¬ 
lation and gave them their classical mathematical 
form. This work may justly be regarded as the 
most important single contribution to physics that 
has ever been made. With its publication there 
was begun a period of more than two centuries in 
which it was sought to reduce all physics to 
Newtonian dynamics. In the Principia Newton
not only showed by rigorous mathematical deduc¬ 
tion how the gravitational law of inverse squares 
led to the empirically determined laws of Kepler, 
but he gave a satisfactory dynamical explanation of 
many known facts about the motions of the 
heavenly bodies, the tides, etc. His treatment of 
the irregularities of the moon's motion is exceed¬ 
ingly remarkable. 

The basic elements in Newton’s theory of space, 
time and gravitation are easily stated. The first 
scholium of his Principia shows that he adhered to
the customary notion of absolute space and absolute 
time: 

“I. Absolute, true, and mathematical time, of 
itself, and from its own nature flows equably with¬ 
out regard to anything external, and by another 

[56] 


IfS) 



NEWTON S PHILOSOPHY OF GRAVITATION 


name is called duration; relative, apparent, and 
common time is some sensible and external 
(whether accurate or unequable) measure of dura¬ 
tion by means of motion. 

“II. Absolute space, in its own nature, without 
regard to anything external, remains always similar 
and immovable. Relative space is some movable 
dimension or measure of the absolute space. 

“III. Place is a part of space which a body takes 
up, and is according to the space, either absolute 
or relative. 

“IV. Absolute motion is the translation of a 
body from one absolute place into another; and 
relative motion, the translation from one relative 
place into another.”

There was then for Newton, as for all his pred¬ 
ecessors, a particular absolute space and absolute 
time which formed the background with reference 
to which physical events were described. For 
the ancients the space attached to the earth 
was absolute; for Copernicus, the space attached 
to the sun and fixed stars; for Newton, the space 
defined by the center of gravity of the solar system. 

What is the role of an absolute space? It must 
be a particular space in terms of which the explanation
of physical laws is most simply given.

But the laws formulated by Newton were in
reality such as to make no distinction between the
space attached to the center of gravity of the solar
system, and that attached to the center of gravity
of any other isolated system of bodies. In fact it
became impossible to distinguish the space which
he called absolute from any other space moving
uniformly with respect to it in some fixed direction.
Thus there is in fact a spatial relativity present in his
dynamics, which he felt vaguely, as the following
statement from the same scholium bears witness:

“It is indeed a matter of great difficulty to discover
and effectively distinguish the true motions
of particular bodies from the apparent; because
the parts of that immovable space in which those
motions are performed do by no means come under
the observation of our senses. Yet the thing is
not altogether desperate.”

Apparently Newton conceived of his absolute 
space as filled by an ethereal medium, by the aid of 
which he hoped to be able to determine absolute 
motion. 

As a first approximation to the facts of nature, the 
Newtonian dynamics, with its spatial relativity, is 
likely to stand permanently. It is the simplest 
theory which explains the main facts. The gravitational
theory which is the cornerstone of his
dynamics will stand for the same reason. We 
still teach our students elementary mechanics and 
gravitation on the Newtonian basis. 

It is desirable to emphasize the simplicity and
naturalness of the Newtonian law of gravitation,
as well as the degree of exactitude with which it
accounts for the observed facts. Once the Copernican
theory is grasped, it is seen that the gravitational
forces must act directly between the bodies
concerned, just as bodies are pulled directly towards
the earth by its gravitation. Moreover such
force must diminish as the mutual distance increases,
but whether inversely as the first power of the
distance, as the second power, or as some higher
power is not so plain. However Kepler’s third
law of motion indicates at once that only the
second power is admissible; this fact was established
by Huyghens in 1673.

It is decidedly interesting to consider the somewhat
philosophic principles which led Newton
and his contemporaries to the proper formulation
of the gravitational law. No one has formulated
these principles more admirably than
Newton himself at the beginning of the third
volume of his Principia, or "rules of reasoning
in philosophy." These have been summarized
as follows:

"Rule I. We are not to assume more causes than 
are sufficient and necessary for the explanation of 
the observed facts. 

Rule II. Hence as far as possible similar effects
must be assigned to the same causes; ex. gr., the fall
of stones in Europe and America.

Rule III. Properties common to all bodies within
reach of our experiments are to be assumed as
pertaining to bodies; ex. gr., extension.

Rule IV. Properties in experimental philosophy
obtained by wide induction are to be regarded as
accurate, or at least very nearly true, until phenomena
or experiments show that they may be
corrected or are liable to exceptions.”

These principles remain as unexceptionable today 
as they were at the time of Newton. Newton's 
scientific procedure was in strict accordance with 
these principles. He marshalled the facts then 
known concerning the phenomenon of gravitation, 
and gave a satisfactory explanation of them. His 
theory was the simplest one available, and any 
more elaborate theory would have been a useless 
and unjustified flight of the imagination. 

What is it then that has forced us to progress 
beyond the Newtonian point of view to the next 
stage in the development of our notions concerning 
space, time and gravitation?

In answering this question very briefly, it is
interesting to recall first of all that even in Newton’s
day a certain amount of criticism was made of
his law of gravitation because it allowed one
body to affect another body, however distant,
instantaneously. Action at a distance seemed to
disturb a good many of the natural philosophers
of that day, and Leibnitz in particular criticised
Newton’s theory on that basis. Newton himself
felt it necessary to offer some justification of his
law, and at the end of the third volume of the
Principia will be found some speculations as to
the possibility of explaining gravitation by means
of an all pervading ethereal medium. Undoubtedly
the fact, discovered by Römer in 1675, that light
travels with a large but finite velocity lent force
to this criticism. But Laplace showed later that if
gravitational forces did travel with a finite velocity,
the velocity would be at least ten times that
of light. Hence this objection to the theory
seemed out of harmony with the experimental
results.

The fact that light was propagated at a finite
velocity was indeed of extraordinary significance
from the philosophic point of view. It meant that
events were not seen when they happened. The
apparent simultaneity of events appeared as an
illusion. Thus the related notions of simultaneity
and absolute time no longer could be based upon
the immediate evidence of the senses. However,
this fact alone would never have sufficed to lead to
the modern point of view. The modification has
been brought about by the steady accumulation of
new experimental results in physics. After the
work of Faraday and Maxwell the role of electricity
and magnetism in nature began to appear to be
more and more fundamental. Not only was light
discovered to be an electromagnetic manifestation,
but the atom was found to be governed by electromagnetic
laws. The dynamical behavior of visible
bodies began to be regarded as merely the statistical
result of the electrodynamic behavior of their
atomic constituents. The conjecture inevitably
arose that the physical universe is fundamentally
electromagnetic, that the velocity of light is a
limiting velocity in nature. The Newtonian law
of gravitation could only be regarded as accurate
when the bodies concerned were moving at velocities
small compared with that of light.

One outcome of this modified view of the physical
universe has been the gravitational theory of
relativity, discovered by Einstein in 1915. In it
space and time are taken to be fundamentally conditioned
by the presence of matter, and gravitation
appears as the inevitable consequence of this interconnection.
If the new theory accounts for some
of the slight discrepancies of the Newtonian
theory, and is more sound from a general philosophic
point of view, it is at the same time less
simple, and perhaps less secure of a permanent
place in physics, since the Newtonian theory will
always hold its position as the proper first approximation.
On the other hand, it appears to be certain
that the general influence of the theory of
relativity will remain, and that the classical view
of space and time as a final explanation has been
permanently abandoned.

It is worthy of note that the theory of relativity
of Einstein appears as the simplest possible mathematical
theory which can be built up consonantly
with the electromagnetic structure alluded to
above, and which takes space and time to be
conditioned by matter. The theory of Einstein
offers amazing contrasts with that of Newton.
Space and time are no longer separate, but are
joined together in a four-dimensional space-time.
The fundamental elements are no longer points
and instants of time, but are events, defined by a
point-at-an-instant. Absolute time and simultaneity
no longer exist, but only a local time at
each particle.

Like the Newtonian theory, the theory of Einstein
is only successful in explaining gravitational
phenomena. It throws no light on any other part 
of physics. Moreover there is as yet no indication 
as to precisely how the Newtonian mechanics is 
to be explained in terms of a more fundamental 
relativistic mechanics of the atom. 

As a matter of fact, we have now reached a
stage in which no theories appear to be fundamental
in physics; it is merely that some are more
fundamental than others in certain directions.
Although we have a vague feeling of the unity
of the physical universe, and are in possession of
beautiful mathematical abstractions which account
for numerous phenomena, nevertheless we have
just begun to discover what is going on. The
scientist of today, equally as well as Newton, can
say “I do not know what I may appear to the
world; but to myself I seem to have been only like
a boy playing on the sea-shore, and diverting
myself in now and then finding a smoother pebble
or a prettier shell than ordinary, while the great
ocean of truth lay all undiscovered before me.”





Reprinted from Jahresbericht der Deutschen Mathematiker-Vereinigung,
1929, Vol. 38, Abt. Heft 1/4, p. 1-16.


Some Problems of Dynamics (Einige Probleme der Dynamik).

By George D. Birkhoff in Cambridge (Mass., U.S.A.). 1)

With 4 figures in the text.

Since Hill and Poincaré, the attempt has been made to characterize the
motions of a given dynamical system in their broad qualitative features.
This latest phase in the development of theoretical dynamics is of the
highest interest to the mathematician. The importance of qualitative
dynamical ideas for the exact sciences can hardly be overestimated. To
illustrate such ideas, I shall briefly consider a few simple examples.
After this preparation I wish to draw attention to some unsolved
dynamical problems.

Of very great importance, for example, is the idea, which cannot be
stated quite precisely, that any stable motion of a dynamical system is
either periodic or runs in the neighborhood of a periodic motion. To
treat a very simple case, let us consider the motion of a particle $P$ on
a straight line under the influence of a force $f$ which depends only on
the position and velocity of $P$. 2) We shall assume that there is only
one equilibrium position $O$ on the line. If then $x$ denotes the distance
$OP$ and $t$ the time, we have an equation

$$\frac{d^2x}{dt^2} = f\left(x, \frac{dx}{dt}\right),$$

where $f$ is a given function. Because the given motion is stable, we have

$$|x| \le M, \qquad |y| \le M$$

for $t \ge 0$, where $y = dx/dt$. But the above second-order equation
leads at once to the two first-order equations

$$\frac{dx}{dt} = y, \qquad \frac{dy}{dt} = f(x, y),$$


1) This article reproduces two lectures which the author gave as a guest of the
University of Berlin on 30 June and 3 July 1928.

2) This example, as well as the others given here, is treated in detail in my book
"Dynamical Systems" (New York, 1927). Further references to the literature are also
found there.



where $x, y$ are regarded as rectangular coordinates. To all the motions
there then correspond curves which fill the plane exactly once. The
equilibrium at $O$ corresponds to a point-curve at $x = y = 0$, which we
also denote by $O$.

For $y > 0$ every point moves along its curve to the right, because
$dx/dt > 0$. For $y < 0$ every point moves to the left. On the $x$-axis
every point moves in the direction of the $y$-axis or in the opposite
direction. The given stable motion corresponds to a curve which for
$t \ge 0$ lies in a square $|x| \le M$, $|y| \le M$.

We consider first the simplest possibility, that the moving point never
crosses the $x$-axis. Here the only possibility is that the point
approaches the equilibrium position as $t$ increases indefinitely, as in
Fig. 1a.

Fig. 1a.    Fig. 1b.

With one or two crossings of the axis the figure is as in 1b or 1c
respectively. With two crossings, the two crossings must lie on opposite
sides of the equilibrium position, because on one and the same side the
direction is always the same. Here we have a motion of the particle $P$
which oscillates once or twice through the equilibrium position and then
approaches this position. With three crossings $A$, $B$, $C$ it is evident
that $C$ lies not only on the same side as $A$ but between $O$ and $A$.
Here the particle passes three times through the equilibrium position,
but the third oscillation is smaller than the first. Afterwards $P$
approaches this position. These processes can be continued. We must have
either a finite number of ever smaller oscillations followed by an
approach to the equilibrium position, or infinitely many oscillations. In
this last case the motion is either exactly periodic, or the oscillations
increase so as to approach a periodic motion, or they decrease so as to
approach such a periodic motion, or they decrease so as to approach the
equilibrium position. At least in this special case the above general
idea of a connection between stability and periodicity is justified.

Fig. 1c.
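The behavior described above can be checked numerically on one concrete example. The following is a minimal sketch, not taken from the article: it assumes the particular force $f(x, y) = -x - 0.2\,y$ (a damped linear oscillator), integrates the two first-order equations with a classical Runge-Kutta step, and records the points at which the phase curve crosses the $x$-axis. The function names and the numerical parameters are illustrative choices.

```python
def rk4_step(f, x, y, h):
    """One classical Runge-Kutta step for dx/dt = y, dy/dt = f(x, y)."""
    k1x, k1y = y, f(x, y)
    k2x, k2y = y + 0.5 * h * k1y, f(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = y + 0.5 * h * k2y, f(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = y + h * k3y, f(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


def axis_crossings(f, x0, y0, h=1e-3, t_max=30.0):
    """Return the x-values at which the phase curve crosses the x-axis."""
    xs = []
    x, y = x0, y0
    for _ in range(int(t_max / h)):
        xn, yn = rk4_step(f, x, y, h)
        if y * yn < 0:
            # y changed sign during this step: locate the crossing
            # by linear interpolation between the two end points.
            s = y / (y - yn)
            xs.append(x + s * (xn - x))
        x, y = xn, yn
    return xs


def damped(x, y):
    """Assumed example force: linear restoring force plus weak damping."""
    return -x - 0.2 * y


crossings = axis_crossings(damped, 1.0, 0.5)
for i, x in enumerate(crossings[:5]):
    print(i, round(x, 4))
```

For this assumed force the crossings alternate between the two sides of $O$ and shrink in absolute value, so the third crossing lies between $O$ and the first, in agreement with the argument for three crossings $A$, $B$, $C$.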

On deeper consideration of this idea, certain "central motions" as well
as "recurrent motions" appear as the true generalizations of the periodic
motions.

In classical dynamics the differential equations are mostly of
Hamiltonian or canonical form. In the simplest case of one degree of
freedom the two differential equations are these:

$$(1)\qquad \frac{dp}{dt} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{dt} = \frac{\partial H}{\partial p},$$

where $H$ (the energy) is a given analytic function of $p$ and $q$. By
multiplying these two equations by $\partial H/\partial p$ and
$\partial H/\partial q$ and adding, we see at once that in the
$p, q$-plane a point $P$ moves along a curve $H = \text{const}$. Hence the
motion must either be unstable, or periodic, or approach an equilibrium
state $(p_0, q_0)$. Furthermore, by use of the integral
$H = \text{const.}$ these differential equations can be integrated.
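The step "multiply and add" can be written out explicitly. The short derivation below is a worked version of that one sentence, using only the canonical equations with the sign convention $dp/dt = -\partial H/\partial q$, $dq/dt = \partial H/\partial p$:

```latex
\begin{aligned}
\frac{\partial H}{\partial p}\,\frac{dp}{dt}
  + \frac{\partial H}{\partial q}\,\frac{dq}{dt}
  &= \frac{\partial H}{\partial p}\left(-\frac{\partial H}{\partial q}\right)
   + \frac{\partial H}{\partial q}\,\frac{\partial H}{\partial p} = 0, \\
\text{hence}\qquad
\frac{d}{dt}\,H\bigl(p(t), q(t)\bigr) &= 0
\quad\Longrightarrow\quad H = \text{const. along each motion.}
\end{aligned}
```

The left-hand side of the first line is exactly the chain-rule expression for $dH/dt$, which is why the point $(p, q)$ stays on a level curve of $H$.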

We now wish to consider a Hamiltonian system of two degrees of freedom,
because it is the simplest unsolvable case. The equations are

$$(2)\qquad \frac{dp_i}{dt} = -\frac{\partial H}{\partial q_i}, \qquad \frac{dq_i}{dt} = \frac{\partial H}{\partial p_i} \qquad (i = 1, 2),$$

where $H$ is a given analytic function of the four variables $p_1$, $q_1$,
$p_2$, $q_2$. With the aid of the known energy integral,
$H = \text{const.}$, and of the independence of $H$ from $t$, the order of
this system can be reduced twice in a purely formal way. This reduction is
well known and need not be carried out here. The new simplified equations
can be expressed in the form of a Hamiltonian system of one degree of
freedom as follows:

$$(3)\qquad \frac{dp}{d\tau} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{d\tau} = \frac{\partial H}{\partial p}.$$

Here $H$ is a known function of the variables $p$, $q$, $\tau$. We shall
use this reduction later.

If $p_1$, $q_1$, $p_2$, $q_2$ denote the coordinates of a point in a
four-dimensional space, the original four equations define a definite
flow of a fluid in this space. The velocity components are precisely the
four quantities on the right-hand sides of equations (2). Each point
$(p_1, q_1, p_2, q_2)$ represents a definite state of motion. The
streamlines or curves of motion correspond to all possible motions of the
system. The totality of all possible points corresponds to the "manifold
of states" and may be either open or closed. Two such dynamical systems
are called "equivalent" if a point transformation exists which carries
the points and the curves of motion of the one system into the points and
the curves of motion of the other.

The real goal of dynamics is to determine all invariants of a given
system under such transformations, so that it becomes possible to answer
the question whether two such systems are equivalent or not.

In general such invariants do not exist in the large; rather, they exist
in the neighborhood of an equilibrium state or (speaking somewhat more
generally) of a periodic motion. We therefore wish to consider the
neighborhood of a closed curve of motion which corresponds to such a
periodic motion.

Because of our reduction we consider only those nearby states of motion
which have the same energy constant as the given periodic motion. They
correspond to a three-dimensional part of the manifold of states, which
forms a torus (in the topological sense).

With a suitable choice of the variables $p$, $q$, $\tau$ the given curve
of motion will lie along the $\tau$-axis in a rectangular
$p, q, \tau$-space, where two points $(p, q, \tau + 2\pi)$ and
$(p, q, \tau)$ correspond to one and the same state of motion. Here the
torus becomes an infinite cylinder.

Now let any surface $S$ be given which cuts the closed curve of motion in
a point $Q$ at an angle different from zero. For example, the plane
$\tau = 0$ is evidently a surface of this kind. Let $P$ denote a point of
this surface, and let us follow, in the sense of increasing time, the
curve of motion passing through $P$ up to the first following point $P_1$
which lies on $S$. In this way a point transformation $T$ is defined, from
any $P$ to its corresponding $P_1$: $P_1 = T(P)$. This transformation and
also the inverse transformation $P = T^{-1}(P_1)$ are analytic, provided
only that the given problem and the surface of section $S$ are analytic.
It is to be noted that $Q$ itself is a fixed point of the transformation
$T$.

To give a simple example of such a transformation $T$, let us consider the
second-order differential equation

$$q'' + k^2 q = 0$$

or, what is the same, the two differential equations of the above type (3)
with

$$H = \tfrac{1}{2}\left(p^2 + k^2 q^2\right), \qquad \text{that is,} \qquad \frac{dp}{d\tau} = -k^2 q, \quad \frac{dq}{d\tau} = p.$$

The solution $(p, q)$ which takes the values $(p_0, q_0)$ for $\tau = 0$
is given by the formula

$$p = p_0 \cos k\tau - k q_0 \sin k\tau, \qquad q = \frac{p_0}{k} \sin k\tau + q_0 \cos k\tau.$$

When $\tau$ increases from $0$ to $2\pi$ we obtain $p_1$, $q_1$:

$$p_1 = p_0 \cos 2k\pi - k q_0 \sin 2k\pi, \qquad q_1 = \frac{p_0}{k} \sin 2k\pi + q_0 \cos 2k\pi.$$

In the coordinates $(p/\sqrt{k},\; q\sqrt{k})$ this transformation $T$ is
an ordinary rotation through an angle $2k\pi$.
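The closing claim, that the map over one period becomes a plain rotation in the rescaled coordinates, is easy to confirm numerically. The sketch below is only an illustration: the function names and the sample values of $k$, $p_0$, $q_0$ are arbitrary choices, and the closed-form solution of $q'' + k^2 q = 0$ is used directly rather than a numerical integrator.

```python
import math


def period_map(p0, q0, k):
    """Closed-form map (p0, q0) -> (p1, q1) as tau runs from 0 to 2*pi."""
    a = 2.0 * math.pi * k
    p1 = p0 * math.cos(a) - k * q0 * math.sin(a)
    q1 = (p0 / k) * math.sin(a) + q0 * math.cos(a)
    return p1, q1


def scaled(p, q, k):
    """Pass to the coordinates (p / sqrt(k), q * sqrt(k))."""
    return p / math.sqrt(k), q * math.sqrt(k)


def rotate(u, v, angle):
    """Ordinary rotation of the plane about the origin."""
    c, s = math.cos(angle), math.sin(angle)
    return u * c - v * s, u * s + v * c


k, p0, q0 = 0.3, 0.7, -1.2          # arbitrary sample values
image = scaled(*period_map(p0, q0, k), k)
expected = rotate(*scaled(p0, q0, k), 2.0 * math.pi * k)
print(image, expected)
```

The two printed pairs agree to rounding error, which is the statement that $T$ is a rotation through the angle $2k\pi$ in these coordinates.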

It is evident that with another choice of the coordinates or of the
surface of section a transformation $\bar{T}$ is defined which is
equivalent to $T$. In fact, with a new choice of the coordinates only the
coordinates in $S$ are changed. Also with a new choice of the surface of
section, to each pair of values $p, q$ of the variables in $S$ there
corresponds one and only one pair of values $\bar{p}, \bar{q}$ in
$\bar{S}$, so that the transformations $T$ and $\bar{T}$ must be
equivalent here too.

This result can be reversed; that is, if the associated transformations
of two dynamical problems are equivalent, then these problems are
equivalent to each other. To prove this it is only necessary to define a
one-to-one continuous mapping of the two manifolds of states. This is done
geometrically as follows: the points of $S$ and $\bar{S}$ correspond in
the given manner. Every other point $P$ of the first manifold lies on an
arc $QQ_1$ which ends in the two points $Q$, $Q_1$ of the surface of
section. The point $P$ divides the arc length $QQ_1$ into two parts $QP$,
$PQ_1$. We denote the ratio $QP/PQ_1$ by $\sigma$. We let two points $P$
and $\bar{P}$ correspond if they lie on corresponding arcs $QQ_1$ and
$\bar{Q}\bar{Q}_1$, with equal $\sigma$ and $\bar{\sigma}$. In this way a
continuous transformation of the one manifold into the other is defined,
which carries the curves of motion of the one into the curves of motion of
the other.

Here the point transformations used are only continuous, so that in this
method of proof we use the group of all continuous point transformations.

These considerations show that all dynamical properties of the motion
correspond to the properties of these transformations, so that the
dynamical problem is reduced to the problem of the transformation of a
plane in the neighborhood of a fixed point. It is also clear that such a
reduction of an arbitrary dynamical problem to a transformation problem is
always possible, at least in the neighborhood of a periodic motion.

What, now, are the characteristic properties of a transformation $T$
which belongs to a dynamical problem (3)?

To answer this question we note first that, if we use canonical variables
$p$, $q$, $\tau$ and operate with the surface of section $\tau = 0$, the
area is not changed by $T$. In fact the flow leaves the volume invariant,
because the equations can be written in the following way:

$$\frac{dp}{d\tau} = -\frac{\partial H}{\partial q}, \qquad \frac{dq}{d\tau} = \frac{\partial H}{\partial p}, \qquad \frac{d\tau}{d\tau} = 1 \qquad (H = H(p, q, \tau)),$$

in which

$$\frac{\partial}{\partial p}\left(-\frac{\partial H}{\partial q}\right) + \frac{\partial}{\partial q}\left(\frac{\partial H}{\partial p}\right) + \frac{\partial}{\partial \tau}(1) = 0.$$

Furthermore every point moves with a velocity component 1 in the
direction of the $\tau$-axis. Hence a small cylinder with a base $\sigma$
in $\tau = 0$ and with a constant small height $h$ must always have a base
of the same area. Any area $\sigma$ in $\tau = 0$ must therefore be equal
to the corresponding area in the surface $\tau = 2\pi$, q.e.d.

We can now sketch a somewhat different picture of our problem in a plane.
In a $p, q$-plane each point moves, for fixed $\tau$, with velocity
components $-\partial H/\partial q$, $\partial H/\partial p$. In this way
a variable flow in the plane is defined, which is the same as if the plane
$\tau = c$ moved with velocity 1 in the direction of the $\tau$-axis and
each point of this $p, q$-plane remained on its associated curve of
motion. This variable flow is that of an incompressible fluid in the
plane, because the area of any part remains constant. We also see at once
that the two-dimensional flow is periodic, with period $2\pi$. The
preceding transformation $T$ has here the following meaning: a point $P$
of the fluid which for $\tau = 0$ has the position $(p_0, q_0)$ will be at
$(p_1, q_1)$ after $2\pi$ seconds.

Conversely, every area-preserving periodic flow of this kind corresponds
to a dynamical problem (3). In fact the differential equations for it can
be written in the form

$$\frac{dp}{d\tau} = P, \qquad \frac{dq}{d\tau} = Q \qquad (P = P(p, q, \tau),\; Q = Q(p, q, \tau)),$$

where $P$, $Q$ are periodic with period $2\pi$ in $\tau$ and where

$$\frac{\partial P}{\partial p} + \frac{\partial Q}{\partial q} = 0$$

because of the incompressibility. If we now write

$$H = \int_{(0,0)}^{(p,q)} \left(Q\,dp - P\,dq\right),$$

these differential equations take exactly the form (3).
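It may help to spell out why this line integral does the job. The following is a short verification under the stated hypotheses only (a flow $dp/d\tau = P$, $dq/d\tau = Q$ with $\partial P/\partial p + \partial Q/\partial q = 0$ in a simply connected neighborhood of the origin); nothing beyond those hypotheses is assumed.

```latex
\begin{aligned}
&\text{Path independence: } \omega = Q\,dp - P\,dq \text{ is closed, since }
  \frac{\partial Q}{\partial q} - \frac{\partial(-P)}{\partial p}
  = \frac{\partial P}{\partial p} + \frac{\partial Q}{\partial q} = 0. \\[4pt]
&\text{Hence } H(p, q, \tau) = \int_{(0,0)}^{(p,q)} \omega
  \text{ is well defined, with }
  \frac{\partial H}{\partial p} = Q, \qquad \frac{\partial H}{\partial q} = -P. \\[4pt]
&\text{Therefore }
  \frac{dp}{d\tau} = P = -\frac{\partial H}{\partial q}, \qquad
  \frac{dq}{d\tau} = Q = \frac{\partial H}{\partial p},
  \text{ which is the form (3).}
\end{aligned}
```

Since $P$ and $Q$ are periodic in $\tau$ with period $2\pi$, the function $H$ constructed in this way has the same periodicity.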

Hence it seems evident that every one-to-one area-preserving periodic
transformation $T$ belongs to a Hamiltonian problem of the kind (3). This
fact can be proved at once, at least if the functions which enter into
$T$, together with all their derivatives, are continuous and $H$ is of the
same kind (but perhaps not analytic). 1)

For dynamical systems of several degrees of freedom the corresponding
volume-preserving property of such transformations $T$ is not completely
characteristic.

Let us now consider any area-preserving transformation $T$ in the
neighborhood of the fixed point. In general such transformations belong
to one of the following kinds. Either the linear part of $T$ is a rotation

$$p_1 = p_0 \cos\theta - q_0 \sin\theta, \qquad q_1 = p_0 \sin\theta + q_0 \cos\theta$$

with $\theta/2\pi$ an irrational number, or it has the form

$$p_1 = \lambda p_0, \qquad q_1 = \frac{1}{\lambda}\, q_0$$

with $\lambda^2 \ne 1$. The second, much simpler type may be called
unstable, the first may be called stable.

In the second case it seems certain that, with a suitable choice of the
$p$, $q$, the transformation $T$ in general takes the following normal
form:

Pi = **'<'P». ?i = l 

Here all point transformations are admitted in which all the associated
functions as well as their derivatives are continuous. For the original
dynamical problem, there correspond to $p_0 = 0$ and $q_0 = 0$ two
families of asymptotic motions. All other nearby motions only approach the
periodic motion in order to move away from it later.

1) See my note "A Remark on the Role of Poincaré's Geometric Theorem"
(Szeged Acta, 1928).




In the first case there is also a simple normal form

$$p_1 = p_0 \cos(\theta + r_0^2) - q_0 \sin(\theta + r_0^2),$$
$$q_1 = p_0 \sin(\theta + r_0^2) + q_0 \cos(\theta + r_0^2),$$

where $r_0^2 = p_0^2 + q_0^2$, which in general is attainable only in a
formal way.

We see that here too only one formal invariant $\theta$ exists. The
transformation is to be regarded, with great accuracy, as similar to a
rotation about $(0, 0)$ through an angle $\theta + r_0^2$, which depends
on the radius $r_0$.
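Two properties of this normal form can be checked directly: each circle $r = r_0$ is carried into itself, and the map preserves area even though the angle of rotation varies with the radius. The sketch below is an illustration only; the function names, the sample point and the finite-difference step are arbitrary choices, and the Jacobian determinant is estimated by central differences rather than computed symbolically.

```python
import math


def twist_map(p0, q0, theta):
    """Rotation about (0, 0) through the radius-dependent angle theta + r0^2."""
    angle = theta + p0 * p0 + q0 * q0
    c, s = math.cos(angle), math.sin(angle)
    return p0 * c - q0 * s, p0 * s + q0 * c


def jacobian_det(p, q, theta, h=1e-6):
    """Central-difference estimate of the Jacobian determinant at (p, q)."""
    pp, qp = twist_map(p + h, q, theta)
    pm, qm = twist_map(p - h, q, theta)
    dp_dp, dq_dp = (pp - pm) / (2 * h), (qp - qm) / (2 * h)
    pp, qp = twist_map(p, q + h, theta)
    pm, qm = twist_map(p, q - h, theta)
    dp_dq, dq_dq = (pp - pm) / (2 * h), (qp - qm) / (2 * h)
    return dp_dp * dq_dq - dp_dq * dq_dp


theta, p0, q0 = 1.0, 0.3, -0.2      # arbitrary sample values
p1, q1 = twist_map(p0, q0, theta)
det = jacobian_det(p0, q0, theta)
print(math.hypot(p0, q0), math.hypot(p1, q1), det)
```

The two radii coincide and the determinant comes out equal to 1 up to the error of the difference quotient, which is consistent with the area-preserving character required of $T$.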

To determine the nature of the transformation in this case is one of the
most interesting and most difficult problems of mathematics. The essential
question is the following: under repetition of the transformation $T$, do
all points $P$ near $(0, 0)$ always remain nearby? This is the problem of
dynamical stability in the simplest case, which is still completely
unsolved today.

In this direction I should like to add one more remark. From the nature
of the transformation one can arrive in this case at the result that in
the neighborhood of the corresponding periodic motion there exist
infinitely many periodic motions with the given energy constant, which
make many circuits during one period of this motion. The manner of proof
is as follows (naturally I cannot give all the details): for large $n$ and
sufficiently near $(0, 0)$, $T^n$ has the form of a rotation through an
angle $n(\theta + r_0^2)$, at least to the extent that the rotation along
a circle $C$, $r = r_0$, is more than $2\pi$ greater than the rotation
along $r = 0$. Hence we can find an $m$ such that, when $T^n$ is
compounded with a rotation through an angle $-2m\pi$, the rotation must be
positive along $r = r_0$ and negative along $r = 0$.

One sees at once that under this new transformation $T^n_m$ there must
exist at least one closed curve $C$ around the point $(0, 0)$ on which no
rotation takes place. Because of the area-preserving property of $T^n_m$,
$C$ and $T^n_m(C)$ must have at least one common point $P$. Furthermore it
can be proved directly that $C$ is cut once and only once by each radius.
Hence the image curve $C_n$ must also have this property, because each
point $Q_n$ of $C_n$ lies on the same radius as $Q$. Thus $T^n_m(P)$ must
then coincide with $P$.

For such a point $P$ we have $T^n_m(P) = P$. This point corresponds to a
periodic motion which makes $m$ circuits of the given periodic motion. But
$n$ is chosen arbitrarily large, so that infinitely many such periodic
motions must exist nearby.

The so-called last geometric theorem of Poincaré was set up by him in
order to establish the existence of such new periodic motions. But our
method shows us how in many of the most important cases the use of this
theorem can be avoided. It is very interesting that most of the false
attempts to prove this theorem rest on considerations of a nature quite
similar to the above. But these attempts fail precisely because in the
general case of Poincaré's theorem it is not known that $C$ is of such a
special kind.

We now wish to consider briefly the following four examples of dynamical
systems: a) billiard ball on an elliptical table; b) particle on a smooth
convex surface; c) particle on a smooth closed surface of everywhere
negative curvature; d) the problem of three bodies.

a) Billiard ball on an elliptical table.

The geodesics on an ellipsoid with semi-axes $a$, $b$, $c$
($a > b > c > 0$) have been known since Jacobi. They also appear as the
general solution of an integrable Hamiltonian problem, because a particle
which moves on a smooth ellipsoid without the influence of external forces
must follow such a geodesic. If now the smallest semi-axis $c$ converges
to zero while the other semi-axes $a$, $b$ remain constant, the ellipsoid
passes over into an ellipse. The geodesics become straight segments, and
two such segments which belong to the same line and follow one another
must meet the ellipse at equal angles. But such broken lines are the
idealized paths of a billiard ball on an ellipse. Naturally this problem
must also be "integrable."

The geometric property which corresponds to this integrability is a
well-known one: two successive segments are always tangents of one and the
same conic, which has the same foci as the given ellipse. Hence all the
motions divide into analytic families, according to the corresponding
conic.
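The confocal-conic property can be observed in a small simulation. The sketch below is not from the article and rests on one standard fact of elementary geometry, which is taken here as an assumption: all tangent lines of a conic confocal with the table have the same product of signed distances from the two foci. If successive segments touch one and the same confocal conic, this product must therefore stay constant from bounce to bounce. The semi-axes, the starting point and the starting direction are arbitrary choices.

```python
import math


def next_hit(x, y, dx, dy, a, b):
    """From a boundary point, travel along (dx, dy) to the next boundary point."""
    # (x + t dx)^2 / a^2 + (y + t dy)^2 / b^2 = 1 has the root t = 0
    # (the starting point); the other root is t = -B / A.
    A = dx * dx / (a * a) + dy * dy / (b * b)
    B = 2.0 * (x * dx / (a * a) + y * dy / (b * b))
    t = -B / A
    return x + t * dx, y + t * dy


def reflect(x, y, dx, dy, a, b):
    """Reflect the direction in the tangent of the ellipse at (x, y)."""
    nx, ny = x / (a * a), y / (b * b)       # gradient = normal direction
    norm = math.hypot(nx, ny)
    nx, ny = nx / norm, ny / norm
    dot = dx * nx + dy * ny
    return dx - 2.0 * dot * nx, dy - 2.0 * dot * ny


def focal_product(x, y, dx, dy, a, b):
    """Product of the signed distances from the two foci to the line of travel."""
    c = math.sqrt(a * a - b * b)
    l1 = (x - c) * dy - y * dx
    l2 = (x + c) * dy - y * dx
    return l1 * l2


def focal_products(a, b, theta0, angle, n_bounces):
    """Follow the ball for n_bounces segments and record the focal product of each."""
    x, y = a * math.cos(theta0), b * math.sin(theta0)
    dx, dy = math.cos(angle), math.sin(angle)   # unit direction, pointing inward
    values = []
    for _ in range(n_bounces):
        values.append(focal_product(x, y, dx, dy, a, b))
        x, y = next_hit(x, y, dx, dy, a, b)
        dx, dy = reflect(x, y, dx, dy, a, b)
    return values


values = focal_products(2.0, 1.0, 0.3, 2.5, 50)
print(min(values), max(values))
```

The recorded values agree to rounding error over all the segments. A positive value corresponds to a confocal ellipse as the common tangent conic, a negative value to a confocal hyperbola.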

The possible states of motion correspond to all the points of the ellipse
together with all possible directions. If then $x$, $y$ are the
coordinates of a point of the ellipse and $\varphi$ is the direction
angle, each $(x, y, \varphi)$ gives a state of motion. The totality of
such states evidently has the topological nature of a torus, as long as
one disregards the states $(x, y, \varphi)$ where $(x, y)$ lies on the
ellipse itself. But for such points $(x, y, \varphi)$ and
$(x, y, \varphi_1)$ are to be regarded as the same state if $\varphi$ and
$\varphi_1$ correspond to two successive segments. Hence the manifold of
states in this case is a closed one.

We thus have an integrable dynamical problem with a closed manifold of
states.

Here too a transformation $T$ can be defined without exception. Let
$\theta$ be a variable with period $2\pi$ which fixes the position of a
point on the ellipse, and let $\varphi$ be a variable which gives the
angle between the rebounding billiard ball and the positive direction of
the tangent, so that $0 \le \varphi \le \pi$. For each pair
$(\theta, \varphi)$ there is a next following $(\theta_1, \varphi_1)$. The
totality of the states of motion $(\theta, \varphi)$ which correspond to
the impacts at the boundary forms a surface of section $S$ for all
possible curves of motion, with the exception of the two rolling motions
along the curve. This surface of section is ring-shaped. We can then write

$$(\theta_1, \varphi_1) = T(\theta, \varphi).$$

Because this problem is integrable, we have on $S$ closed, invariant,
analytic curves which are transformed into themselves by $T$ or $T^2$. All
the directions of projection which define segments touching one and the
same conic with the same foci belong to one or two such closed curves. It
is very easy to determine the topological nature of these curves.

One sees at once that there are four kinds of motions: 1. everywhere
dense periodic motions, which correspond to certain of these curves;
2. everywhere dense recurrent but non-periodic solutions, which
correspond to the other curves and which form the general case in
the sense of the Lebesgue integral; 3. two families of motions which
approach asymptotically, in both directions of time, the periodic
motion along the major axis; these correspond to the paths which
pass once, and therefore infinitely often, through the foci; 4. the
two rolling motions in opposite senses around the ellipse, which are
also periodic motions. All periodic motions, with the sole exception
of the motion along the minor axis and of the two rolling motions,
are unstable.

In this way one obtains a complete survey of the types of motion
and their relations to one another, as is to be expected in such an
integrable problem.

b) Particle on smooth, closed, convex surfaces.

The state manifold in this case is evidently closed. But for such
general convex surfaces there is no reason to expect a grouping of
the motions into closed families, as in the case of the ellipsoid.

The construction of a surface of section $S$ and of a
transformation $T$ is somewhat more complicated here than in the
preceding problem. Intuitively one sees first that there is a
shortest length for a closed curve such that the convex body can
pass through the curve. A curve of this length must, at least once,
be stretched around the surface and in this position form a closed
geodesic $G$. All the states of motion at the crossings of $G$ give
two surfaces of section of suitable kind and the associated
transformations $T$.

A complete treatment of the motions in an arbitrarily given case
seems almost impossible, because one cannot actually carry out the
infinite processes which enter. In reality we could obtain a very
good picture of them if only we knew all the invariant regions in $S$.

But it seems quite certain that in general there are no such
regions in $S$. In such a case one can prove 1. that there are
infinitely many periodic motions, which are dense in themselves,
2. that asymptotic motions of all conceivable types exist everywhere
densely, 3. that motions exist which come near all states, and so on.
In this way we obtain here too a fairly satisfactory survey of the
types of motion and their relations to one another. Naturally the
situation here is much more complicated than in an integrable case.

c) Particle on smooth, closed surfaces of everywhere negative
curvature.

To form such an example we consider, for instance, the surface

$$z^2 = 1 - e^2 \sin^2 \tfrac{1}{2}x \, \sin^2 \tfrac{1}{2}y \qquad (e > 1),$$

where $x, y, z$ are rectangular coordinates, and where we assume that
all points

$$x + 2k\pi, \quad y + 2l\pi \qquad (k, l = 0, 1, 2, \dots)$$

correspond to the same point of the surface. This assumption is
justified because all the transformations

$$x' = x + 2k\pi, \quad y' = y + 2l\pi$$

carry the surface into itself. A fundamental region is the square in
the $x, y$-plane

$$0 \le x < 2\pi, \quad 0 \le y < 2\pi.$$

Further consideration shows at once that the curvature is negative
throughout, with the exception of the points lying on the sides of
the square, and that the surface is of genus 2. Moreover there is one
and only one geodesic $AB$ from a given point $A$ to a given point $B$
which arises from a given curve $AB$ by continuous deformation. In the
same way one sees that one and only one closed geodesic of given
type exists.

This example is of quite a different kind from the two preceding
ones, because no periodic motion of stable type can exist.

As is easily proved, there are here not only everywhere dense
periodic motions but also other types of recurrent motions, as well
as motions which assume almost all states, and so on. Here there is
an algorithm of such a kind that one can obtain a survey of all the
motions.

In this case it does not seem possible to form a complete surface
of section $S$ with a corresponding point transformation $T$;
nevertheless the nature of the motions is almost as well known as in
the integrable case.

The three preceding examples were of geodesic type. This is,
however, no real restriction, because all ordinary dynamical
problems can be formulated as geodesic problems.

d) Problem of three bodies.

Here we assume that the 10 constants of the 10 known integrals
are given. We assume further that the three constants of areas do
not all vanish.

The state manifold is here to be regarded as an open one of 7
dimensions, because the coordinates can become infinite. The
possibility of a collision of all three bodies is excluded, and the
collision of two bodies appears only as a removable singularity, as
Sundman first proved.

An important fact is now the following: a curve of motion of this
state manifold containing a point for which all three distances are
small must go to infinity with decreasing or with increasing time.
In the only case that is difficult from the qualitative standpoint,
the total energy is not large enough to remove all the bodies
infinitely far from one another. In this case one and only one of
the three bodies recedes from the other two. For these reasons three
streams from infinity to infinity must exist in the state manifold.
One might believe that in general almost all points of this “sea”
come from one of these streams, in order later to go to infinity in
one of them. Only certain periodic and recurrent motions, and motions
asymptotic to them, are perhaps of another type.

I have indicated only some of the most important properties known
so far for these four examples. A deeper consideration would,
however, give us many further results concerning the nature and the
distribution of all possible motions.

After these preparatory remarks we are in a position to formulate
some important problems of theoretical dynamics which are still
unsolved today.

At the beginning we spoke of the relation between periodicity and
stability. But no such relation holds unless the idea of periodicity
is generalized in a suitable way.

There are here two kinds of generalization. With reference to our
first example, both kinds are very easy to understand.

First we observe that in the course of time every motion
approaches the periodic motions, at least in this special example.
If the state manifold $M$ is closed, there is a closed set $M_1$ whose
motions are approached by all other motions. To define $M_1$ more
precisely, we consider a small molecule in $M$. It may happen that in
the course of time this molecule never returns to its original
position. We shall then call the corresponding motions “wandering.”
The set $M_1$ is precisely the set of the non-wandering motions. We
can now define the non-wandering motions $M_2$ with respect to $M_1$.
There thus arises a countable, well-ordered sequence
$M, M_1, M_2, \dots$, which ends in an $M_r = M_{r+1}$, where $r$ is an
ordinal number in the sense of Cantor. In the course of time every
motion is almost always in the neighborhood of this set $M_r$ of
central motions.

If the state manifold is two-dimensional, one can easily prove
that $r \le 2$. But for a larger number of dimensions $n$ I do not
believe that $r \le n$.




Hence I shall formulate our first problem as follows:

Problem I. To construct a dynamical problem with three-dimensional,
closed state manifold in which the ordinal number $r$ of the central
motions is $> 3$.

There are other important problems concerning the structure of
the central motions.

The second generalization of the periodic motions arises as
follows: No periodic motion approaches another motion. We can
therefore call “recurrent” those motions which approach a minimal
closed set of other motions, so that there is no subset of the same
kind. The existence of such recurrent motions and their
quasi-periodic properties are then easy to prove. A principal
theorem is that every stable motion comes uniformly often into the
neighborhood of such recurrent motions.

There are also many important questions about the structure of
the recurrent motions. But the first and most important concerns the
periodic motions alone and can be formulated as follows:

Problem II. In the case of Hamiltonian problems (2) with two
degrees of freedom, with closed state manifold and with at least one
stable periodic motion, to prove that the periodic motions are
everywhere dense. (Here the energy constant has a given value.)

If this conjecture is correct, then every motion of such a
Hamiltonian system with two degrees of freedom is always in the
neighborhood of periodic motions. I do not believe that the same
holds in the case of more degrees of freedom. But I do believe that
the recurrent motions are everywhere dense. I shall therefore
formulate the third problem as follows:

Problem III. In the case of all Hamiltonian problems with closed
state manifold, to prove that the recurrent motions are everywhere
dense.

There is yet another question concerning the periodic motions: Is
it possible to find a usable extension of Poincaré's last theorem to
more general cases?

Here we must make some preliminary remarks. In the original form
of this theorem there came into consideration the transformation $T$
of a two-dimensional ring into itself. For an application of this
theorem to a given dynamical problem it was therefore necessary to
find a complete surface of section $S$ whose boundary consists of two
periodic curves of motion. But in the case of more degrees of
freedom such a surface of section does not exist unless a closed
invariant family of curves of motion is present. The existence of
such a family, however, is not to be expected. For a possible
dynamical application we must therefore find an extension of the
theorem which concerns only a transformation $T$ in the neighborhood
of a fixed point. Such transformations are always present.

We must now also determine the type of the transformations $T$
which arise from dynamical problems. The area-preserving property,
which is characteristic for the simplest case, can be generalized,
because, as remarked above, $T$ is a volume-preserving transformation
in the general case. But this property is in no way characteristic.
In order to formulate a characteristic property, we consider any
geodesic problem. In the $n$-dimensional surface on which the
particle moves we can construct an $(n-1)$-dimensional surface which
cuts the given closed geodesic in one point. It may now happen that
one and only one geodesic exists in the neighborhood which joins a
point $(x_1, \dots, x_{n-1})$ of this surface to a next following point
$(x_1', \dots, x_{n-1}')$ of the same surface. Then
$(x_1, \dots, x_{n-1}, x_1', \dots, x_{n-1}')$ are to be regarded as
coordinates of an ordinary surface of section; here the energy $H$ of
the particle, and hence its velocity, has a prescribed value, so
that the state manifold is $(2n-2)$-dimensional. Under the
transformation $T$ a point $(x_1, \dots, x_{n-1}')$ of the surface of
section goes over into a point $(x_1', \dots, x_{n-1}'')$. The length
$Q(x_1, \dots, x_{n-1}')$ of the geodesic which joins
$(x_1, \dots, x_{n-1})$ and $(x_1', \dots, x_{n-1}')$ then possesses the
further extremal property that

$$\delta\,[\,Q(x_1, \dots, x_{n-1}') + Q(x_1', \dots, x_{n-1}'')\,] = 0,$$

where $x_1', \dots, x_{n-1}'$ are to be varied. These $n-1$ equations
define the coordinates $x_1'', \dots, x_{n-1}''$ in terms of
$x_1, \dots, x_{n-1}'$, and thereby a transformation $T$.

By the use of a finite number of such auxiliary surfaces one can
always define $T$ in the geodesic problem as follows: There are $k$
functions

$$Q_1(x_1, \dots, x_{n-1}'), \ \dots, \ Q_k(x_1^{(k-1)}, \dots, x_{n-1}^{(k)}),$$

such that the equations

$$\delta\,(Q_1 + \dots + Q_k) = 0$$

hold and define the transformation $T$. Here all the variables vary
except $x_1, \dots, x_{n-1}$ and $x_1^{(k)}, \dots, x_{n-1}^{(k)}$.

We shall call any such transformation $T$ a “conservative
transformation” $T$. There thus arises the following problem:

Problem IV. Given a conservative transformation $T$, to prove the
existence of a corresponding Hamiltonian system, in particular one
of geodesic type.

If this conjecture is correct, then a true generalization of
Poincaré's last theorem must yield, as its exact equivalent, a
theorem on the existence of other closed geodesics in the
neighborhood of a given geodesic. Here the given closed geodesic
must of course be of stable type.

We can now formulate the question of a possible generalization of
Poincaré's theorem as follows:

Problem V. Let $T$ be any conservative transformation with a fixed
point $P$ of stable type. Conditions are then to be determined under
which infinitely many points $P_m$ must exist in the neighborhood
which are fixed points of $T^m$.

Finally I must formulate the important and very difficult problem
of stability in its simplest form:

Problem VI. In the case of two degrees of freedom, to prove the
existence of dynamical systems which possess a periodic motion of
stable type that is nevertheless not actually stable.

The problems given above concern not special dynamical systems
but general ones. There are also many other interesting problems
which concern certain important special problems. Here I should like
to emphasize only two such problems:

Problem VII. To determine the topological nature of the state
manifold in the general problem of three bodies.

Problem VIII. To prove the non-integrability of the problem of
three bodies in the neighborhood of a periodic motion of stable type.

It is to be remarked that the results of Poincaré prove only the
impossibility of a certain uniform integrability with variable
masses.

In my opinion the problems given above are precisely those whose
solutions would make great further advances possible.


(Received 7. 8. 1926.)



Reprinted from Bull. Amer. Math. Soc., June, 1932, Vol. 38, pp. 361-379.


PROBABILITY AND PHYSICAL SYSTEMS* 

BY G. D. BIRKHOFF 

1. Introduction. My aim today is to lay before you some recent
developments in a mathematical field which owes its very
existence to the problems of the physicist and astronomer,
namely that of ordinary differential equations, and to point out
the application of these developments to the theory of physical
systems.

2. The Law of Uniformity and Ordinary Differential Equations.
Any physical system whose state at any time $t$ is fixed by
$n$ real variables $x_1, x_2, \dots, x_n$ evidently satisfies a set of
ordinary differential equations of the form

$$\frac{dx_i}{dt} = X_i(x_1, \dots, x_n), \qquad (i = 1, \dots, n), \tag{E}$$


which embodies its fundamental law of uniformity. In the case
of a Lagrangian physical system, for instance, the $n = 2m$
coordinates $x_i$ are the $m$ geometrical coordinates and their $m$
respective rates of change; in the case of a Hamiltonian system,
the coordinates are the $m$ geometrical coordinates and the $m$
corresponding momenta.

I propose to restrict attention to physical systems of the
above type E involving a continuous time $t$ and a finite number
of degrees of freedom, and thus to forego all consideration of
physical systems with a discontinuous time $t$ or an infinite
number of degrees of freedom.

3. The Three Main Types of Physical Systems. For convenience
we shall divide such physical systems into three main
types: (a) the general non-recurrent systems in which only
exceptionally the system recurs to the vicinity of an arbitrary
initial state; (b) general recurrent systems; (c) variational
systems derived from a variational principle.

* Presented under the title Stability and instability in physical systems
before a joint meeting of the American Mathematical Society and the American
Physical Society at New Orleans, December 29, 1931.




4. Example of a Non-Recurrent System. As a simple example
of the non-recurrent type, we shall consider briefly the
following.*

Imagine a mass particle which moves in a line subject to an
arbitrary force which depends only on position and velocity. If
$x$ and $t$ denote the positional coordinate and the time
respectively, the differential equation may be written

$$\frac{d^2x}{dt^2} = f\!\left(x, \frac{dx}{dt}\right),$$

or in the equivalent form E with $n = 2$

$$\frac{dx}{dt} = y, \qquad \frac{dy}{dt} = f(x, y).$$


For definiteness, let us assume that $f$ is analytic and that there
is one and only one equilibrium position for the particle, say at
the origin $x = 0$.

What can be said about the motion of the particle? In
answering this question it is convenient to use a geometric
representation in the $x, y$ plane. Each motion is represented by a
curve $x = f(t)$, $y = dx/dt = f'(t)$, with velocity components
$dx/dt = y$ and $dy/dt$, so that in the upper half plane the point
$(x, y)$ moves to the right, in the lower half plane to the left, and
along the $x$-axis vertically.



* See my book on Dynamical Systems, 1927, Chap. 5. This book will be
referred to hereafter as D.S.

If now we divide the motions into the stable motions for
which $x$ and $dx/dt = y$ remain finite for $t > 0$, and the unstable
motions, we can readily prove that a stable motion of the
particle is periodic, or tends asymptotically towards a periodic
motion, or towards the equilibrium position, with expanding or
diminishing oscillations. This is true of motions stable in the
past as well as of those stable in the future.
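
One of the stable cases described above, a motion tending towards the equilibrium position with diminishing oscillations, can be illustrated numerically. The Python sketch below is an added illustration, not taken from the address. The force law $f(x, y) = -x - 0.5\,y$ (a linearly damped spring), the step size, the integration scheme (classical Runge-Kutta) and all function names are assumptions chosen only to give a concrete instance of the system $dx/dt = y$, $dy/dt = f(x, y)$.

```python
import math

def rk4_step(f, x, y, dt):
    """One classical Runge-Kutta step for dx/dt = y, dy/dt = f(x, y)."""
    k1x, k1y = y, f(x, y)
    k2x, k2y = y + 0.5 * dt * k1y, f(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = y + 0.5 * dt * k2y, f(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = y + dt * k3y, f(x + dt * k3x, y + dt * k3y)
    x_new = x + dt * (k1x + 2.0 * k2x + 2.0 * k3x + k4x) / 6.0
    y_new = y + dt * (k1y + 2.0 * k2y + 2.0 * k3y + k4y) / 6.0
    return x_new, y_new

def damped_force(x, y):
    """Assumed example force: restoring spring plus linear damping."""
    return -x - 0.5 * y

def trajectory(f, x0, y0, dt=0.01, steps=4000):
    """Points (x, y) of the motion starting at (x0, y0) in the x, y plane."""
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(f, x, y, dt)
        points.append((x, y))
    return points

def sign_changes(values):
    """Number of times consecutive nonzero values change sign."""
    nonzero = [v for v in values if v != 0.0]
    return sum(1 for u, v in zip(nonzero, nonzero[1:]) if u * v < 0.0)

if __name__ == "__main__":
    pts = trajectory(damped_force, 1.0, 0.0)
    xs = [p[0] for p in pts]
    print("oscillations through x = 0:", sign_changes(xs))
    print("final distance from equilibrium:", math.hypot(*pts[-1]))
```

With this assumed force the curve spirals into the origin: $x$ changes sign repeatedly while the distance from the equilibrium point shrinks. A force without damping, such as $f = -x$, would instead give the periodic case.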

Similarly, unstable motions oscillate with wider and wider
swings in such a way that either the amplitude or the velocity
or both become indefinitely large. All this follows from simple
considerations based on the analysis situs of the plane of the
kind used by Poincaré in his early papers on ordinary
differential equations.

It will be seen then that only for the periodic motions is there 
recurrence. Hence this system is non-recurrent. 

Of course in special cases, when there is an energy integral, 
for instance, all of the motions may be periodic. In such cases 
the system is recurrent of course. 

5. The Central Motions of Non-Recurrent Systems. In the
general case of non-recurrent systems in $n$ variables, the situation
is similar. For definiteness we shall only consider the closed case
when $x_1, \dots, x_n$ may be considered as a point of a closed
$n$-dimensional space $M$.

This requirement will be realized, for example, if a particle 
moves on the viscous surface of a sphere subject to an arbitrary 
force, so that it moves in the direction of the field of force with 
velocity which is a function of position on the sphere. 

Now if we consider all of the points of $M$, the differential
equations E define a steady flow in $M$. In case there is no
tendency towards recurrence, a “molecule” of this fluid will not in
general overlap later its initial position. Furthermore, if it does
not as time increases, it cannot do so as time decreases. For
imagine that the molecule overlaps its first position at time
$t = -\tau$. Let time increase by $\tau$. Clearly the position at
$t = -\tau$ of the molecule will be restored to the initial position,
and the initial position to that at time $t = \tau$, and the
overlapping will of course persist. Thus, in the case of
non-recurrence, there is a doubly infinite non-overlapping tube formed
by the molecule of non-recurring motions. The set of limiting motions
$M_1$ of such non-recurring motions may now be treated in the same
way with reference to recurrence and this leads to a subset $M_2$ of
$M_1$, and so on. Thus we obtain an ordered sequence of sets of
motions $M_1, M_2, \dots$, which terminates in the “central motions”
$M_c$, as I have called them.*

* D.S., Chap. 7.




In the case of the particle on the surface of the sphere, we
might have the possibility illustrated below in one hemisphere,
in which the limiting great circle together with the equilibrium
point $E_0$ forms the set $M_1$, and the three equilibrium points
$E_0, E_1, E_2$ form the set $M_2$ of central motions.



The two fundamental properties of the central motions are 
the following: the set of central motions is recurrent in type; the 
time-probability is 1 that any motion of a physical system is 
arbitrarily near the set of central motions. 

Hence in closed non-recurrent systems the motions will lie (in
the sense of time-probability) arbitrarily near the set of central
motions of recurrent type.

If the “energy” of a physical system is dissipated indefinitely, 
it is clear that the corresponding differential equations will have 
as central motions the equilibrium states of the system. 

6. A Remark Concerning Atomic Systems. If we consider an
atomic system as a dynamical system with $n$ degrees of freedom,
at least to a certain degree of approximation, it would seem
probable that these equations must be of non-recurrent type and
not of the usual recurrent type of classical dynamics. In fact, if
the central motions were a set of periodic motions, the system
would always tend towards one of these and, when disturbed,
would revert to the same periodic motion, or to a different
periodic motion if the disturbance were large enough. The relative
amount of time spent at any considerable distance from these
periodic motions would be very small. Evidently this general
situation would agree qualitatively with the facts of radiation.

7. The Recurrence Theorem of Poincaré. Suppose next that the
corresponding differential equations E are valid in an
$n$-dimensional closed space $M$, and that the $n$-dimensional volume of
any molecule is not changed as the time changes. Then there
will necessarily be recurrence, as follows from a classical
reasoning due to Poincaré.*

In fact consider a small molecule of the fluid at time $t = 0$ and
at successive intervals $t = \tau, 2\tau, \dots$. If the volume of the
molecule is $v$, and the total volume of $M$ is $V$, and if $n$
successive positions do not overlap, we must have $nv \le V$. Thus
overlapping between some $i$th and $j$th position must recur for
$i < j \le n$ if $n > V/v$. But it follows then that the $(j - i)$th
position overlaps the original position; that is, there is recurrence.
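
The counting argument can be checked on a very simple volume-preserving system. In the Python sketch below, which is an added illustration and not part of the address, the space $M$ is the unit torus (so $V = 1$), the "flow" at the successive times is the translation by an assumed vector $(\sqrt{2}, \sqrt{3})$ reduced modulo 1, and the molecule is a square of side $1/8$ (so $v = 1/64$). The function names and these numerical choices are hypothetical. The argument above then predicts that the molecule overlaps its original position again after at most $V/v = 64$ steps.

```python
import math

def torus_distance(u):
    """Distance from the real number u to the nearest integer."""
    frac = u % 1.0
    return min(frac, 1.0 - frac)

def first_return(alpha, beta, side):
    """Smallest k >= 1 for which the square [0, side)^2, translated k times
    by (alpha, beta) on the unit torus, overlaps its initial position.

    Two such squares overlap exactly when both coordinate offsets are
    closer than `side` to an integer (valid for side < 1/2).
    """
    k = 1
    while not (torus_distance(k * alpha) < side
               and torus_distance(k * beta) < side):
        k += 1
    return k

if __name__ == "__main__":
    side = 0.125                        # v = side**2 = 1/64, V = 1
    bound = int(1.0 / (side * side))    # V / v = 64
    k = first_return(math.sqrt(2.0), math.sqrt(3.0), side)
    print("first overlap after", k, "steps; bound from the argument:", bound)
```

The sketch only confirms the bound for one molecule and one translation; the theorem itself concerns every volume-preserving flow on a closed space.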

By refinement of this mode of argument, Poincaré proved a
result which in present-day terminology may be stated as
follows. All of the motions corresponding to curves which traverse
the given molecule recur to it infinitely often in past as well as
in future time, except for a set of measure zero in the sense of
Lebesgue, that is, a set which can be enclosed in a numerable set
of volumes whose total volume is arbitrarily small.

For a recurrent system all of the motions are central motions;
that is, $M = M_1 = M_c$. We shall only consider recurrent systems
in which such an invariant volume integral exists.

8. Probability and Physical Systems. From the standpoint of
the physicist it is not the specific motions that are of interest
since the initial conditions are not precisely determinable.
Nevertheless the mathematician has concerned himself largely
with properties of highly improbable special motions such as the
periodic motions rather than with the general motions of
dynamical systems.

As far as I know it was Koopman, among mathematicians,
who first emphasized the importance of getting away from the
exclusive consideration of those “properties which are changed
altogether by an infinitely small change in the physical
conditions attendant on the problem, or by the slightest change in
initial data”† and of obtaining results which are valid in general.

Evidently results of this type are likely to bring in
considerations of probability. Indeed Poincaré stated his result in
the form that the probability of recurrence is one under the
conditions specified. His theorem is then a step in the desired
direction.

* Les Méthodes Nouvelles de la Mécanique Céleste, vol. 3. See also
C. Carathéodory, Über den Wiederkehrsatz von Poincaré, Berliner
Sitzungsberichte, 1919.

† Birkhoff on Dynamical Systems, this Bulletin, vol. 36 (1930), p. 165.

9. The “Ergodic Theorem.” The theoretical physicist has long
emphasized the importance of considerations of probability in
this field, and on an intuitive basis has formulated vaguely
certain types of theorems, one of which in precise form is the
following “ergodic theorem” (as I shall call it). In a closed
recurrent system there is a definite time-probability that the moving
point not belonging to a certain set of measure zero finds itself in
a given region of the space $M$.

Very recently von Neumann,* using an important abstract
formulation due to Koopman,† of the dynamical problem in
terms of linear operations in function space, has succeeded in
showing that probability considerations can be carried further.
His treatment has been given a simplified form by E. Hopf.‡
Von Neumann's result establishes that a similar “mean ergodic
theorem” holds in the highly technical sense of “convergence in
the mean,” although throwing no light on the existence of a
time-probability along the individual motions. Nevertheless
his result marks a vital step in advance; in fact it is sufficient
to solve the statistical problem of classical kinetic theory,
provided that the hypothesis of metrical transitivity (§13) be
granted, and is of the first order of importance in the realm of
ergodic theory.

Shortly thereafter, stimulated by von Neumann's result, I
succeeded in proving by entirely new methods that the ergodic
theorem holds in the ordinary sense of time probability.§


* Proof of the quasi-ergodic hypothesis. Proceedings of the National Academy
of Sciences, January, 1932.

† Hamiltonian systems and transformations in Hilbert space. Proceedings of
the National Academy of Sciences, May, 1931.

‡ See a note On the time-average theorem in dynamics, Proceedings of the
National Academy of Sciences, January, 1932. For other recent work in the
same direction, see the same journal, E. Hopf, Complete transitivity and the
ergodic principle, February; B. O. Koopman and J. von Neumann, Physical
applications of the ergodic hypothesis, March; E. Hopf, Proof of the Gibbs
hypothesis of the tendency towards statistical equilibrium, April.

§ Proof of a recurrence theorem for strongly transitive dynamical systems and
Proof of the ergodic theorem. Proceedings of the National Academy of Sciences,
December, 1931. See also the March issue of the same journal, A. Wintner,
Remarks on the ergodic theorem of Birkhoff.




A special case of the ergodic theorem is the following: If an
$(n-1)$-dimensional open or closed surface $\sigma$ in $M$ cutting across
the curves of motion in one sense be considered, the curves of
motion will have a definite mean time of crossing of $\sigma$, the same
in both directions of time, except for a set of points of Lebesgue
measure 0.

This mean crossing-time theorem states then that the
following limit exists for every motion except those of a set of
measure 0:

$$\lim_{n \to \pm\infty} \frac{t_n}{n} = \tau_P.$$

Here $t_n$ denotes the elapsed time to the $n$th crossing of $\sigma$ from
$P$ on $\sigma$. Evidently this is a stronger form of the recurrence
theorem of Poincaré.

As a simple application, consider any convex billiard table,
and a chalked line $l$ on the table. In general the idealized
billiard ball will cross this line with a certain perfectly definite
mean time of crossing (which may be infinite). It is known that
the billiard ball problem is of recurrent type.

If a certain further condition of metric transitivity (Section
13) is satisfied, this mean time can be easily determined and is
the same for all of these motions. On the other hand in special
cases (that of an elliptical billiard table for instance) the mean
time will not in general be the same for different motions.

10. Remarks on the Ergodic Theorem. In order to make plain
the nature and scope of the ergodic theorem as applied to
recurrent systems, certain remarks need to be made.

In the first place, the theorem applies to physical systems of
classical dynamical type with given energy constant, in case the
$n$-dimensional space $M$ is closed. For such systems possess the
invariant volume integral which necessitates recurrence.

In proving that the limit is the same for $t \to -\infty$, my argument in
the second note is not properly formulated. However, if the last displayed
inequality (p. 659) did not hold, there would be a measurable invariant set
$S$ for which the stated limit is either greater than $\mu$ or is less than
$\lambda$ for $t = -\infty$. Application of the lemma would then give

$$\lambda \int_S dP \le \int_S t(P)\,dP \le \mu \int_S dP$$

together with

$$\int_S t(P)\,dP > \mu \int_S dP \quad \text{or is} \quad < \lambda \int_S dP,$$

which is not possible.

In the second place, when $M$ is open, the theorem applies
also to any set of the motions which remain away from the
boundary of $M$, provided that there is such an invariant volume
integral.

In the third place, the ergodic theorem may either be
expressed in terms of selected volumes in $M$, or in terms of an
arbitrary function $f(P)$ of position on $M$; in the latter case it
states that $f(P)$ will (in general) have a mean value in time along
any curve.* The two formulations are at bottom equivalent. It is
hardly necessary to state then that the ergodic or time-average
theorem permits of innumerable applications.
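
The two formulations can be seen side by side in a small numerical experiment. The Python sketch below is an added illustration under several assumptions: instead of a continuous flow it uses a discrete-time stand-in, the rotation of a circle of unit length by the golden-ratio angle, and the function $f$ is taken to be the indicator of the arc $[0, 0.3)$, so that its time mean is the fraction of time spent in a selected region. The function names, the rotation angle and the starting point are hypothetical choices. For this particular rotation the orbit is known to be equidistributed, so the time mean should come out close to the arc length $0.3$; the ergodic theorem itself asserts only that such a mean exists for almost all starting points.

```python
import math

def time_average(f, x0, alpha, steps):
    """Mean of f along the orbit x0, x0 + alpha, x0 + 2*alpha, ... (mod 1)."""
    total = 0.0
    for k in range(steps):
        total += f((x0 + k * alpha) % 1.0)
    return total / steps

def in_arc(x):
    """Indicator function of the arc [0, 0.3) on the circle of length 1."""
    return 1.0 if x < 0.3 else 0.0

if __name__ == "__main__":
    golden = (math.sqrt(5.0) - 1.0) / 2.0    # assumed irrational rotation
    for steps in (100, 10_000, 100_000):
        print(steps, time_average(in_arc, 0.1, golden, steps))
```

Replacing `in_arc` by any other bounded function of position gives the second formulation; the code is unchanged, which reflects the remark that the two formulations are at bottom equivalent.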

The exceptional motions left out of account are in general
everywhere dense although of measure 0. An analogous case is
that of the rational points $x = m/n$ expressed by proper rational
fractions. These are everywhere dense on the segment $0 < x \le 1$
of the line, but are of measure zero, as is well known. For, put
an interval of length $\varepsilon^n/n$ about each such point. There are
only $n - 1$ points with a given $n$, so that these are enclosed by
intervals of total length less than $\varepsilon^n$. Hence all of the
rational points are enclosed in a set of total length
$\varepsilon/(1 - \varepsilon)$ at most, which can be made arbitrarily small.
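
Written out as a single estimate, with $\varepsilon$ denoting the small parameter of the covering and $0 < \varepsilon < 1$ assumed so that the geometric series converges, the bound on the total length reads as follows. This display is an added worked step, not part of the original address.

```latex
\sum_{n=1}^{\infty} (n-1)\,\frac{\varepsilon^{n}}{n}
  \;<\; \sum_{n=1}^{\infty} \varepsilon^{n}
  \;=\; \frac{\varepsilon}{1-\varepsilon},
  \qquad 0 < \varepsilon < 1 .
```

The first sum counts, for each denominator $n$, the $n - 1$ intervals of length $\varepsilon^n/n$; since $(n-1)/n < 1$ every term is smaller than $\varepsilon^n$, and the right-hand side tends to 0 with $\varepsilon$.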

In order to bring out the significance of the theorem let us 
make an application. 

Suppose that, in an idealized Sun, Earth and Moon system, 
these bodies are moving in plane orbits according to Newton’s 
law. Consider the motions of this configuration which are stable 
in the sense that the following inequalities hold for all time: 

$$R > r_{SE} > R' > r > r_{EM} > r' > 0,$$

where $r_{SE}$ and $r_{EM}$ designate the distances from the Sun to the 
Earth and from the Earth to the Moon, respectively. It follows 
rigorously from the ergodic theorem that, if the probability of 
such stability is not 0, there is a true mean rate of relative rotation 
of Sun and Moon about the Earth except in an infinitely 
improbable case. 

* See Hopf, loc. cit. 


1932.] PROBABILITY AND PHYSICAL SYSTEMS 369


In fact the totality of such motions forms a measurable set. 
If this set is of measure zero, the probability of stability is 0. 
Otherwise the ergodic theorem may be applied in the manner 
indicated to this set of positive measure. 

The ergodic theorem tells us nothing about periodicity properties. 
This may be explained by analogy as follows. If we write 
down an arbitrary infinite decimal, the limit of the sum of the 
first n figures divided by n will tend towards 9/2, the average of 
0, 1, ..., 9. This fact tells us nothing about the recurrence of 
sequences of figures in a particular infinite decimal. 
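
The analogy can be illustrated with a short Python sketch, which is not part of the original paper. It assumes that an "arbitrary" decimal may be modelled by pseudo-random digits (the fixed seed is an arbitrary choice), and compares it with a purely periodic decimal. Both have mean digit 9/2, so the mean alone does not distinguish a recurrent pattern of figures from a patternless one.

```python
import random

def mean_digit(digits):
    """Average of the first len(digits) figures of a decimal."""
    return sum(digits) / len(digits)

rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
random_digits = [rng.randrange(10) for _ in range(200_000)]
periodic_digits = [k % 10 for k in range(200_000)]  # 0123456789 0123456789 ...

print(mean_digit(random_digits))    # close to 4.5
print(mean_digit(periodic_digits))  # exactly 4.5
```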

11. Regional Transitivity. If now we proceed further with the 
classification of closed recurrent systems, they may be separated 
to begin with into the transitive type, when an arc of a curve of 
motion can be found (perhaps very long) which joins a point P′ 
near an arbitrary point P to a point Q′ near an arbitrary point Q, 
and the intransitive type when this is not true. For transitive 
systems there exist truly “general motions” which pass infinitely 
often arbitrarily near all points of M in the future and 
in the past alike.* Such motions may be called “general motions,” 
in contrast to the “special motions,” not having this 
property. 

In the case of such regional transitivity, it is not known, 
however, whether the general motion is or is not general in the 
sense of probability. 

12. Example of a Closed System with Regional Transitivity. I 
have shown elsewhere that there exist transitive recurrent systems 
of classical dynamical type with a closed manifold M.† 
These are formed by the geodesic lines on certain closed two- 
dimensional surfaces of everywhere negative curvature; such 
lines are of course the curves of motion of a particle constrained 
to move in the surface but not acted upon by any force but the 
force of constraint normal to the surface. The existence of such 
systems can be rendered intuitively evident in a special case as 
follows. 

The doubly periodic surface 

$$z^2 = 1 - e^2 \sin^2 \tfrac{1}{2}x \, \sin^2 \tfrac{1}{2}y, \qquad (|e| > 1),$$

* D.S., Chap. 7. 

† D.S., Chap. 8. 




has everywhere negative curvature except at the points for 
which x and y are both multiples of $2\pi$, where the curvature is 
zero. Suppose now that we consider the straight lines $x = \pm\pi$ 
and $y = \pm\pi$ to be joined at corresponding points (conceptually) 
without distortion of distances on the surface. A surface having 
the connectivity of the double anchor ring is obtained. 

Now any such surface of negative curvature has the property 
that if an inextensible string be wound on the surface from a 
point P to Q and pulled taut, it will take a determinate position, 
namely along the unique geodesic joining P to Q. This geodesic 
can be characterized topologically by the method of winding the 
string on the surface, and variation in the position of the two 
points P and Q has very little effect on the intermediate position of the 
geodesic.* Hence if an infinite string be wound successively in all 
possible combinations of more and more complicated types, one 
obtains a geodesic which must approach all possible directions 
and all possible points. This shows that the system is transitive 
in the regional sense. 

These transitive systems are of Hamiltonian type but a detailed 
analysis shows that they possess no formally stable periodic 
motions. 

13. Metrical Transitivity. The important idea of metrical 
transitivity may be defined as follows. If every measurable set 
of curves of motion is either of measure 0 or of the measure V 
of the volume of M, the system is said to be metrically transitive. 

Systems which are metrically transitive are regionally transitive, 
but the converse is not true. In the case of metrical transitivity 
the curves of motion are so inextricably intermixed that 
sets of motion of positive measure cannot be separated off. 

The importance of this idea arises from the fact that, almost 
certainly, recurrent physical systems are in general metrically 
transitive, although this is very difficult to prove. 

In the case of metrical transitivity, the ergodic theorem takes 
the simpler form that the time-probability is given by the ratio 
v/V of the volume v in question to the volume V of M. 


* The proof of these statements involves reasoning analogous to that used 
in other connections by Hadamard and Morse; see D.S., Chap. 8. 



A very simple example of a metrically transitive system,* with 
demonstration,† is as follows. 

Consider the pair of differential equations 

$$\frac{d\phi}{dt} = \frac{\alpha}{2\pi}, \qquad \frac{d\psi}{dt} = 1, \qquad (\alpha \text{ irrational}),$$

where $\phi$, $\psi$ are angular variables of period $2\pi$. Here the manifold 
M is a two-dimensional torus, and the stream lines are of the 
form $2\pi(\phi - \phi_0) = \alpha(\psi - \psi_0)$. A measurable set of curves will 
correspond to a measurable point set on the “circle” $\psi = 0$, given 
by the points of intersection of these stream lines with $\psi = 0$. If 
we define $f(\phi)$ as 1 or 0 according as the point of this circle with 
coordinate $\phi$ does or does not belong to this measurable set, 
this function $f$ admits a formal Fourier expansion 

$$\cdots + c_{-1} e^{-i\phi} + c_0 + c_1 e^{i\phi} + \cdots, \qquad (i = \sqrt{-1}).$$

But after $2\pi$ seconds this point set has moved into itself, each 
point moving along its curve to a new position $\phi + \alpha$. Thus the 
series 

$$\cdots + c_{-1} e^{-\alpha i} e^{-i\phi} + c_0 + c_1 e^{\alpha i} e^{i\phi} + \cdots$$

represents the same function $f(\phi)$. Hence $c_1, c_{-1}, c_2, c_{-2}, \cdots$ are 
0 (since $\alpha$ is irrational); and the development of $f(\phi)$ reduces to 
a constant. But this would mean a measurable set of constant 
density, that is, either of measure 0 or of measure $2\pi$, as is to 
be proved. 
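
The consequence of metrical transitivity for this example, namely that the time-probability of a region equals its relative volume, can be observed numerically. The sketch below is not part of the original paper. It follows one stream line through its successive returns to the circle $\psi = 0$ (one return every $2\pi$ seconds, each advancing $\phi$ by $\alpha$) and counts how often the return falls in a chosen arc. The function name, the arc, and the particular value of $\alpha$ are illustrative assumptions; in particular the sketch assumes that $\alpha/2\pi$ is irrational, which is what the vanishing of the Fourier coefficients requires.

```python
import math

TWO_PI = 2.0 * math.pi

def time_fraction_in_arc(alpha, arc_start, arc_len, n_returns, phi0=0.0):
    """Fraction of the first n_returns crossings of the circle psi = 0
    whose coordinate phi lies in the arc [arc_start, arc_start + arc_len)."""
    hits = 0
    for k in range(n_returns):
        phi = (phi0 + k * alpha) % TWO_PI   # position after k returns
        if (phi - arc_start) % TWO_PI < arc_len:
            hits += 1
    return hits / n_returns

# Illustrative choice: alpha / (2 pi) equal to the golden-ratio conjugate.
alpha = TWO_PI * (math.sqrt(5.0) - 1.0) / 2.0
v = 1.0  # length of the arc; the whole circle has length 2 pi
print(time_fraction_in_arc(alpha, 0.3, v, 100_000), v / TWO_PI)
```

The two printed numbers agree closely, in keeping with the ratio v/V stated above for the metrically transitive case.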

In all likelihood a proof of metrical transitivity in general will 
be exceedingly difficult, since the “problem of stability” (Section 
17) must first be solved, and this problem has so far defied solution. 
However in the special system of geodesics on a closed surface 
of negative curvature, I believe that metrical transitivity 
can be demonstrated without excessive difficulty, since a complete 
algorithm exists for the effective treatment of this special 
type of dynamical system.‡ 

* G. D. Birkhoff and P. A. Smith, Structure analysis of surface transformations, 
Journal de Mathématiques, vol. 7 (1928). The definition of metrical 
transitivity appears in this paper. In this connection, see also P. A. Smith, 
The regular components of surface transformations, American Journal of 
Mathematics, vol. 52 (1930). 

† This proof was discovered independently by Koopman and by Hopf 
(loc. cit.). 

‡ D.S., Chap. 8. 




14. Instability of Periodic Motions in Recurrent Systems. Before 
passing on to variational systems it is desirable to point out 
an important characteristic of the periodic motions of recurrent 
systems. The linear differential equations of variation along 
such a motion have periodic coefficients with the period of the 
motion. There will be in general n particular solutions such that 
the jth solution is multiplied by a certain constant, real or imaginary, 
$\rho_j$, when t increases by this period. In the recurrent case the 
product of these n roots will be 1, but there is no other condition 
upon the constants $\rho_j$. Thus there is no reason, except in the 
case n = 2, to expect that the characteristic multipliers occur in 
reciprocal pairs. Hence small perturbations from periodic motion 
will not in general remain small in either direction of time, 
since some of the characteristic multipliers will be less and some 
will be greater than 1 in absolute value. 

In fact it may be shown that, unless infinitely many further 
conditions are satisfied, there will be instability in both directions 
of time along such periodic motions. As we shall see, this 
lack of formally stable periodic motions is correlated with a 
much more rapid circulation of a general point P in the space 
M than is possible when any periodic motion of formally stable 
type is present. 

15. Variational Systems. A string stretched on a smooth surface 
falls along a geodesic which is the path of a particle moving 
freely in the surface. This simple fact indicates that the curves 
of motion of such a particle are obtained from variational equations. 
When such equations are expressed in proper coordinates 
$p_i$, $q_i$, they take the usual Hamiltonian form, with a Hamiltonian 
function H representing the total energy. The variational 
principle for these variational systems is then 

$$\delta I = \delta \int_{t_0}^{t_1} \Big[ \sum_i p_i q_i' - H \Big] \, dt = 0,$$

where $t_0$ and $t_1$ are fixed, and $p_i$, $q_i$ have given values for $t = t_0$ 
and $t = t_1$. The curves along which $\delta I$ vanishes are precisely the 
solutions of the Hamiltonian equations. 

Such sets of equations possess an invariant 2m-dimensional 
integral $\int dp_1 \cdots dq_m$. By means of the known energy integral 
H = const., this system may be reduced to one of order 2m − 1 
which possesses a corresponding (2m − 1)-dimensional invariant 




volume integral. Hence if H = const, forms a closed space M, 
the physical system will be recurrent for such a value of the 
energy constant. 

If now we make a general transformation from $p_i$, $q_i$ $(i = 1, \cdots, m)$ 
to $x_1, \cdots, x_{2m}$, the variational principle takes the form 

$$\delta I = \delta \int_{t_0}^{t_1} \Big[ \sum_{i=1}^{2m} X_i x_i' - H \Big] \, dt = 0,$$

and we obtain a corresponding set of differential equations 
which I have called Pfaffian.* 

This form has the advantage that it is left invariant under a 
perfectly arbitrary transformation of the 2m dependent variables, 
which involves 2m arbitrary functions, whereas the Hamiltonian 
form is only left invariant under certain contact transformations. 

It appears highly probable that not only can one pass from 
the special Hamiltonian form to the Pfaffian, but also that inversely 
one can reduce the Pfaffian form to the Hamiltonian 
form. In fact this has already been established by Feraud in 
certain cases.† If this conjecture be true, it must be regarded 
as a mere exercise in analytic ingenuity to employ only Hamiltonian 
systems. 

16. Trigonometric Stability. In the mathematical treatment of 
the solar system, it appeared step by step, following Newton, 
Laplace, and Poisson, that the motion was expressible at least 
to terms of the second order by means of trigonometric series. 

Poincaré was the first, however, to show why, for any Hamiltonian 
system, it is true that all of the infinitely many conditions 
for such trigonometric stability are automatically satisfied along 
any periodic motion, as soon as the usual first-order conditions 
for stability are fulfilled, namely that the multipliers are distinct 
imaginary quantities of modulus 1, but are not roots of unity. 

For such trigonometrically stable periodic motions, a point P 
of M distant from the corresponding closed curve by at most $\epsilon$ 
at $t = 0$ remains within a distance $L\epsilon$ during a time of order $\epsilon^{-k}$, 
where k is any arbitrarily large positive integer. 

* D.S., Chaps. 2, 3. 

† On Birkhoff's Pfaffian systems, Transactions of this Society, vol. 32 (1930); 
Extension au cas d'un nombre quelconque de degrés de liberté d'une propriété 
relative aux systèmes Pfaffiens, Comptes Rendus, vol. 190 (1930), pp. 358-360. 




It may be shown that, from a very broad formal point of view, 
systems of variational type, like the Hamiltonian and Pfaffian, 
are the only ones whose periodic motions possess this property.* 
In consequence of this property of trigonometric stability, the 
perturbations of motions near stable periodic motions of variational 
systems can be expressed with an extraordinary degree 
of accuracy by means of trigonometric series, even if these be 
actually divergent. Our solar system furnishes an obvious illustration 
of a physical system of variational type. 

From the above point of view the essential significance of the 
variational systems seems to be this characteristic property of 
trigonometric stability. 

I have also shown (loc. cit.) that variational systems are reversible 
in time from the purely formal point of view, and that, 
conversely, formally reversible systems will enjoy this property 
of trigonometric stability. Thus variational character is intimately 
associated with reversibility, as well as with trigonometric 
stability. 

17. The Problem of Stability. In all likelihood trigonometric 
stability does not mean actual permanent stability. So far, however, 
the most arduous efforts of mathematicians have failed to 
show the existence of cases in which a motion arbitrarily near 
such a stable motion ultimately deviates from it. As long as this 
possibility is not demonstrated, it will not be possible to prove 
regional transitivity in any physical system possessing periodic 
motions of stable type. 

However, although ultimate instability remains unestablished, 
I have recently proved that “rings of instability” at 
least do exist.† 

To explain the significance of this fact let us consider the simplest 
case of a dynamical system with two degrees of freedom 
with coordinates $p_1, q_1, p_2, q_2$. Further let us give attention to a 
particular energy level, H = C. 

A periodic motion of stable type is represented by a closed 
curve C in this three-dimensional space H = C. Cut this curve 
by a two-dimensional element of surface $\sigma$ at a point Q. Take an 

* D.S., Chap. 4. 

† Sur l'existence des régions annulaires d'instabilité, Annales de l'Institut 
Henri Poincaré, vol. 2 (1931). 




arbitrary point P in the surface and follow along the corresponding 
curve of motion to the first next point $P_1$ on $\sigma$. We see then 
that the point P is transformed to $P_1$, that is, $P_1 = T(P)$. Furthermore, 
the point Q is invariant: Q = T(Q). 

Further investigation shows that the transformation is area-preserving 
in suitable variables. Conversely any such “conservative” 
transformation T may be associated with a dynamical 
system of Hamiltonian type. 
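
To make the notion of a conservative transformation with an invariant point concrete, here is a small Python sketch that is not taken from the paper. The map `T` below is a hypothetical example chosen only for illustration (a quadratic shear followed by a rotation through an assumed angle `a`); it is not the section map of any particular dynamical system. The sketch checks numerically that the Jacobian determinant equals 1, so that areas are preserved, and that the point Q = (0, 0) is invariant.

```python
import math

def T(x, y, a=1.3):
    """Hypothetical area-preserving map: the shear (x, y) -> (x, y - x^2)
    followed by a rotation through the angle a about Q = (0, 0)."""
    w = y - x * x
    return (x * math.cos(a) - w * math.sin(a),
            x * math.sin(a) + w * math.cos(a))

def jacobian_det(x, y, h=1e-6):
    """Determinant of the Jacobian of T at (x, y) by central differences."""
    xp, xm = T(x + h, y), T(x - h, y)
    yp, ym = T(x, y + h), T(x, y - h)
    dXdx = (xp[0] - xm[0]) / (2 * h)
    dYdx = (xp[1] - xm[1]) / (2 * h)
    dXdy = (yp[0] - ym[0]) / (2 * h)
    dYdy = (yp[1] - ym[1]) / (2 * h)
    return dXdx * dYdy - dXdy * dYdx

print(T(0.0, 0.0))              # the invariant point Q
print(jacobian_det(0.4, -0.7))  # close to 1: areas are preserved
```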

In the case when the periodic motion is actually stable, there 
exists an infinite series of areas invariant under T and closing 
down upon Q, as was noted by Poincaré (loc. cit.). This means of 
course that there are tubular regions of complete stream lines in 
M, which close down upon the curve of periodic motion. 

I have shown that the boundaries of such regions form surfaces 
having a certain amount of smoothness. More precisely, 
the invariant curves have the form $r = f(\theta)$, where $r$, $\theta$ are polar 
coordinates and $f$ is a continuous periodic function of period $2\pi$ 
with limited difference quotient. Furthermore these curves form 
a closed series in a sense which I will not stop to specify.* 

A simple possibility is that there exist rings of instability 
formed by adjacent invariant curves of this description, and this 
case actually arises, as was stated above. For such rings it may 
be proved that points arbitrarily near any point of either boundary 
ultimately will pass arbitrarily near any point of the other 
boundary. If it could be proved that such a ring can extend to 
the invariant curve r = 0, the problem of stability would be 
solved, in the sense that trigonometric stability would be shown 
not to necessitate actual stability. 

18. The Ergodic Function $T(\epsilon)$. In my opinion, the function 
$T(\epsilon)$, giving the least time T which elapses before the point P 
of some motion can come within a distance $\epsilon$ of every point in 
M, is destined to play an important part in the characterization 
of closed transitive physical systems. I will venture therefore 
to call $T(\epsilon)$ the “ergodic function.” For the intransitive recurrent 
systems there will be an ergodic function for each of the 
domains of transitivity into which M is divided. 

According to the results stated above, in the general variational 
case with periodic motions of stable type, a point P cannot 
leave the $\epsilon$-neighborhood of the corresponding closed curve 
of motion in time less than $\epsilon^{-k}$. Hence in this case the ergodic 
function $T(\epsilon)$ increases more rapidly as $\epsilon$ diminishes than any 
negative power of $\epsilon$. 

* Surface transformations and their dynamical applications, Acta 
Mathematica, vol. 43 (1912). 

On the other hand, a rough estimate of $T(\epsilon)$ may be attempted 
in the case of geodesic motion on a closed surface of 
negative curvature when there are periodic motions of unstable 
type only. This may be done in the following approximate and 
non-rigorous fashion. 

According to the algorithm referred to, a complete geodesic 
may be associated with a doubly infinite sequence 

$$\cdots a_{-2} \, a_{-1} \, a_0 \, a_1 \, a_2 \cdots,$$

where the a's are chosen at pleasure out of a set of N letters 
each of which represents a fundamental circuit on the surface. 
Thus the totality of motions is represented by the totality of 
these symbols. The symbols resemble those of ordinary infinite 
decimals, except that instead of 10 numerals there are N letters 
extending to left as well as to right. 

If we select one letter as $a_0$ of this symbol, we are fixing upon 
that segment of the geodesic which arises from the corresponding 
circuit. 

A finite symbol 

$$\cdots [a_{-m} \cdots a_0 \, a_1 \cdots a_m] \cdots$$

in which the letters more than m places distant from $a_0$ are unspecified 
will correspond to a three-dimensional volume v in the 
three-dimensional closed space $M(x, y, \phi)$, representing the 
points (x, y) together with the directions $\phi$ corresponding to 
each state of motion. This volume is obtained from a given segment 
by continuous variation of the end points as far as possible 
without alteration of the given finite symbol. 

There will be $N^{2m+1}$ of these volumes since there are $N^{2m+1}$ 
possible finite symbols. These together make up the total volume 
V of M. It is natural to suppose then that each of these 
volumes is approximately of the order $N^{-2m}$ of smallness. 

Furthermore, these volumes are more or less cylindrical, of 
length approximately l, where l is the mean length of a fundamental 
circuit, and so of cross section of order $N^{-2m}$. It would 
seem likely, therefore, that the cross-sectional area is of diameter 
approximately of order $N^{-m}$. Hence, in order that some 
particular point P traverse all of M within a distance $\epsilon = N^{-m}$, 
it is sufficient that the corresponding finite symbol contain all 
of the possible sequences of 2m + 1 letters of which there are 
$N^{2m+1}$ in all. The order in which these occur is immaterial, and 
the number of letters need not be of order higher than $N^{2m}$. 

But the time $T(\epsilon)$ corresponding to the geodesic segment with 
this finite symbol is of the order of the number of letters. Hence 
$T(\epsilon)$ is of the order of $N^{2m}$, that is, of the order of $\epsilon^{-2}$ only. This 
is obviously the minimum possible order. 
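
The counting step, that a single finite symbol can contain every possible sequence of 2m + 1 letters without being much longer than the number of such sequences, can be illustrated in Python. The sketch below is not from the paper; it uses the standard recursive construction of a de Bruijn sequence, a technique brought in here purely for illustration, with hypothetical function names. For N letters and words of length 2m + 1 it produces a symbol of $N^{2m+1} + 2m$ letters, which exceeds $N^{2m}$ only by a factor of about N and is therefore of the same order as m grows with N fixed.

```python
def de_bruijn(num_letters, word_len):
    """Cyclic sequence over range(num_letters), of length
    num_letters**word_len, containing every word of length word_len once.
    Assumes num_letters >= 2."""
    a = [0] * (num_letters * word_len)
    out = []

    def extend(t, p):
        if t > word_len:
            if word_len % p == 0:
                out.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            extend(t + 1, p)
            for j in range(a[t - p] + 1, num_letters):
                a[t] = j
                extend(t + 1, t)

    extend(1, 1)
    return out

def count_words(num_letters, word_len):
    """Return (length of the cyclic symbol, number of distinct words found
    in its linear unrolling of length num_letters**word_len + word_len - 1)."""
    cyc = de_bruijn(num_letters, word_len)
    linear = cyc + cyc[:word_len - 1]
    words = {tuple(linear[i:i + word_len]) for i in range(len(cyc))}
    return len(cyc), len(words)

N, m = 3, 1
print(count_words(N, 2 * m + 1))  # (27, 27): all N**(2m+1) words occur
```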

Thus, in all likelihood, the ergodic function $T(\epsilon)$ increases only 
as the (n − 1)st power of the reciprocal of $\epsilon$ in the general closed 
recurrent case in n dimensions, whereas it certainly increases more 
rapidly than any negative power of $\epsilon$ in the variational case provided 
a single periodic motion of stable type is present. 

Consequently it is likely that in the general closed recurrent 
case, an arbitrary motion (aside from those of a set of measure 0) 
will traverse all of M within a distance $\epsilon$ during a time of order 
$\epsilon^{-(n-1)}$. 

It seems also to be likely that the function $T(\epsilon)$ increases extraordinarily 
rapidly towards $\infty$ as $\epsilon$ tends towards 0 in the closed 
variational case when formally stable periodic motions are present. 

Of course if the problem of stability were solved in the opposite 
sense to that conjectured above, $T(\epsilon)$ would become infinite 
for some definite $\epsilon > 0$. 

19. On Open Systems. Thus far our attention has been directed 
mainly to closed physical systems. Similar results are 
valid for physical systems in which the space M of states of motion 
is open. In fact consider the motions of such a system whose 
points for $t > 0$ are outside of a certain $\delta$-neighborhood of the 
boundary of M. These motions may be termed “stable” and 
form a closed subset of motions of types (a), (b), or (c). 

By letting $\delta$ approach 0 we obtain the totality of stable motions 
in the forward sense of time, to which many of the preceding 
results can be extended. 

Thus, in case (a), any stable motion is within an arbitrarily 
small distance of a central motion nearly always in the sense of 





time-probability, whereas any unstable motion is within distance 
$\epsilon$ of a central motion or of the singular states of motion 
corresponding to the boundary of M, nearly always in the same 
sense. 

Even when there is an invariant volume integral a system 
may be non-recurrent of type (a), if M is open and the total 
volume is infinite. This happens in the case of the problem of n 
bodies. Here, in all likelihood, except for a set of motions of 
measure zero, the n bodies will recede indefinitely from one another 
either singly or in nearby pairs, and recurrence is impossible.* 

Similarly in the recurrent and variational cases (b)† and (c) 
any stable motion in the forward sense is stable also in the backward 
sense, save for a set of measure 0. To these stable motions 
the ergodic theorem applies with slight modification. The remaining 
unstable motions will be unstable in both directions, 
save for a set of measure 0. Under certain conditions which we 
will not state here, the ergodic theorem may be applied to these 
unstable motions also. 

20. Summary. Thus in the consideration of the various kinds 
of physical systems from the standpoint of probability we find 
three main types. 

(a) Closed non-recurrent systems. Here the motion tends 
towards a set of central motions of recurrent type, so that any 
particular motion is actually within distance $\epsilon$ of the central 
motions nearly always in the sense of time-probability. The 
simplest possibility is that in which the central motions are 
equilibrium states or periodic motions. 

(b) Closed recurrent systems. Here there is recurrence because 
of the existence of an invariant volume integral over the space 
M. For the general motion, the “ergodic theorem” ensures general 
time-average properties of these motions, but does not lead 
to explicit evaluations. In the case where there is metrical transitivity, 
these averages are the same for all motions in the sense of 
probability, and in consequence they can be at once evaluated. 
The case of metrical transitivity is probably the general case. 

* See D.S., Chap. 9. 

† The recurrent case is defined as that in which an invariant volume integral 
exists and there is actual recurrence of any molecule so as to overlap its 
initial position. 




There is instability in both directions of time for all periodic 
motions in the general recurrent case and it seems likely that the 
time $T(\epsilon)$ (where $T(\epsilon)$ is the “ergodic function”) necessary for 
some motion to come within distance $\epsilon$ of all states of motion, 
will be of approximate order $\epsilon^{-(n-1)}$. 

(c) Closed variational systems. Here there is recurrence and 
the ergodic theorem holds as in the general recurrent case. The 
chief difference between this case and the general recurrent case 
from the general physical point of view is that $T(\epsilon)$ increases 
more rapidly than any negative power of $\epsilon$. This is because of 
the presence in general of periodic motions possessing trigonometric 
stability, so that motions near such a periodic motion remain 
nearby during an extremely long interval of time. 

Analogous results can be obtained for open systems of types 
(a), (b), or (c) except that it is necessary to deal separately with 
the unstable motions for which the point of M approaches 
arbitrarily near the boundary of M. 

The outstanding problem concerning physical systems from 
the point of view of probability is that of determining to what 
extent recurrent systems are transitive. It is probable that in 
general there is metrical transitivity. It would be a distinct advance 
even to establish that there is metrical transitivity in the 
case of the geodesics on closed surfaces of negative curvature. 

The explicit evaluation of the time-averages whose existence 
is affirmed by the ergodic theorem cannot be made in a given 
case until the precise nature of the transitivity (or intransitivity) 
of the motions has been determined. 

Harvard University 



Reprinted from the Proceedings of the National Academy of Sciences, 
Vol. 19, No. 3, pp. 339-344, March, 1933. 


SOME REMARKS CONCERNING SCHRÖDINGER'S WAVE EQUATION 

By George D. Birkhoff 
Department of Mathematics, Harvard University 
Communicated January 30, 1933 

In the present note I propose to approach the wave equation of Schrödinger 
by a method which, although closely related to methods used by 
de Broglie and Brillouin,¹ Schrödinger² and Dirac,³ is distinct from these 
and has the advantage of fixing the position of the wave equation from a 
purely mathematical point of view: namely, the wave equation is the 
simplest form of linear partial differential equation involving a parameter, 
for which the classical process determining asymptotic series solutions 
gives a "multiplier equation" identical with the Hamilton-Jacobi equation, 
while the Cauchy characteristics of the multiplier equation (along which 
the asymptotic "wave packet" is always propagated) then become the 
dynamical trajectories of the corresponding Hamiltonian system. 

Let $L(\psi, \lambda) = 0$ be any linear (ordinary or partial) differential equation 
in $\psi$ with $n \ge 1$ independent variables $x_1, \ldots, x_n$, and involving a parameter 
$\lambda$. This parameter is to be thought of as large in absolute value. 
In the case which interests us certain of the coefficients of $\psi$ and its derivatives 
in the above equation become large as $|\lambda|$ increases. Under 
these circumstances it has been found in many cases that the operation 
of differentiation of $\psi$ or of its derivatives with respect to any variable 
$x_i$ is asymptotically equivalent to multiplication by $\lambda \, \partial S/\partial x_i$, where S 
is a definite function of $x_1, \ldots, x_n$. Thence one is led to an asymptotic 
series solution, 

$$\psi \sim e^{\lambda S}\Big(v_0 + \frac{v_1}{\lambda} + \frac{v_2}{\lambda^2} + \cdots\Big) \tag{1}$$

where $v_0, v_1, \ldots$ do not involve $\lambda$. It is with certain general facts concerning 
this classical, formal process and its relation to the Schrödinger 
wave equation that the present note deals. If we introduce the modified 
differential operators 



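$$\frac{\delta}{\delta x_i} = \frac{1}{\lambda}\frac{\partial}{\partial x_i},$$

the original equation in $\psi$ may be written as 

$$L\Big(\psi, \frac{\delta\psi}{\delta x_i}, \frac{\delta^2\psi}{\delta x_i \, \delta x_j}, \ldots\Big) = 0, \tag{2}$$

where L may be expanded in the form of an infinite series in $1/\lambda$, 

$$L = L_0 + \frac{L_1}{\lambda} + \cdots \tag{2'}$$

in the case which we are considering. Here $L_0$ may be called the "principal 
part" of L, and may be more explicitly written as follows 

$$L_0(\psi) = \xi\psi + \sum_i \xi_i \frac{\delta\psi}{\delta x_i} + \sum_{i,j} \xi_{ij} \frac{\delta^2\psi}{\delta x_i \, \delta x_j} + \cdots, \tag{3}$$

where we assume 

$$\xi_{ij} = \xi_{ji}, \quad \xi_{ijk} = \xi_{ikj} = \xi_{jki} = \xi_{jik} = \xi_{kij} = \xi_{kji}, \quad \text{etc.}$$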

If now the formal series (1) be substituted in the equation (2), and the 
coefficients of $e^{\lambda S}/\lambda^k$ on the left be equated to 0 for $k = 0, 1, \ldots$, the first 
equation determines S and may be written 

$$P\Big(x_1, \ldots, x_n; \frac{\partial S}{\partial x_1}, \ldots, \frac{\partial S}{\partial x_n}\Big) = 0, \tag{4}$$

where 

$$P = \xi + \sum_i \xi_i \frac{\partial S}{\partial x_i} + \sum_{i,j} \xi_{ij} \frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x_j} + \cdots \tag{5}$$

is a polynomial of degree n in $\partial S/\partial x_1, \ldots, \partial S/\partial x_n$ if n is the order of the 
equation in $\psi$. Conversely any polynomial P determines a corresponding 
$L_0$. We shall term (4) the "multiplier equation" for $L(\psi) = 0$. This is 
a differential equation of the first order in S which does not contain S and 
is in general non-linear. 

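When we proceed with the further similar equations for $k = 1, 2, \ldots$, 
we find that the second equation determines $v_0$ in (1), the third determines 
$v_1$, etc. We shall be interested primarily in the equation for $v_0$. Let us 
observe first that we have, to terms in $1/\lambda$, 

$$\frac{\delta\psi}{\delta x_i} = e^{\lambda S}\Big(\Big[\frac{\partial S}{\partial x_i} + \frac{1}{\lambda}\frac{\partial}{\partial x_i}\Big] v_0 + \frac{1}{\lambda}\frac{\partial S}{\partial x_i} v_1\Big),$$

$$\frac{\delta^2\psi}{\delta x_i \, \delta x_j} = e^{\lambda S}\Big(\Big[\frac{\partial S}{\partial x_i} + \frac{1}{\lambda}\frac{\partial}{\partial x_i}\Big]\Big[\frac{\partial S}{\partial x_j} + \frac{1}{\lambda}\frac{\partial}{\partial x_j}\Big] v_0 + \frac{1}{\lambda}\frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x_j} v_1\Big), \quad \text{etc.}$$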

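Here we have used an obvious operational notation. On substituting 
these expressions in equation (2), the terms in $1/\lambda$ which involve $v_1$ disappear 
because of the multiplier equation (4), and there remains the 
following equation for k = 1, 

$$\sum_i \frac{\partial P}{\partial y_i}\frac{\partial v_0}{\partial x_i} + \frac{1}{2}\Big(\sum_i \frac{\partial^2 P}{\partial x_i \, \partial y_i} + \sum_{i,j} \frac{\partial^2 P}{\partial y_i \, \partial y_j}\frac{\partial^2 S}{\partial x_i \, \partial x_j}\Big) v_0 = 0, \tag{6}$$

where we have written $y_i = \partial S/\partial x_i$ for $i = 1, \ldots, n$. 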

Suppose now that we take any complete solution $S(x_1, \ldots, x_n; c_1, \ldots, c_{n-1}) + c_n$ 
of the multiplier equation. Then $x_1, \ldots, x_n$, $y_1, \ldots, y_n$ 
may be regarded as independent except for the equation of condition 
$P = 0$. If we write 

$$\frac{dx_i}{d\tau} = \frac{\partial P}{\partial y_i} \qquad (i = 1, \ldots, n) \tag{7}$$

and let $\Phi$ denote the coefficient of $v_0$ in (6), the equation (6) becomes 

$$\frac{dv_0}{d\tau} + \Phi v_0 = 0, \quad \text{whence} \quad v_0 = K e^{-\int \Phi \, d\tau}. \tag{6'}$$

Hence the equation (6) specifies how $v_0$ varies along any curve $x_i = x_i(\tau)$, 
$(i = 1, \ldots, n)$, in $x_1, \ldots, x_n$ space representing a solution of (7). 
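
A one-variable illustration may help fix ideas. The Python sketch below is not part of the original note, and every name in it is an assumption made for the example. It takes the equation $(1/\lambda^2)\,\psi'' - q(x)\,\psi = 0$ with the illustrative choice $q = 1 + x^2$, for which the multiplier equation is $y^2 - q = 0$, so that $S' = \sqrt{q}$. Direct substitution of $\psi = e^{\lambda S} v_0$ shows that the terms of order $1/\lambda$ cancel when $2 S' v_0' + S'' v_0 = 0$, that is, when $v_0 = q^{-1/4}$ up to a constant factor. With these choices the residual of the equation, divided by $\psi$, should be $v_0''/(\lambda^2 v_0)$, hence of order $1/\lambda^2$; the code checks this by finite differences on $\log\psi$.

```python
import math

def q(x):
    return 1.0 + x * x

def S(x):
    # Closed form of the integral of sqrt(1 + x^2) from 0 to x, so S' = sqrt(q).
    return 0.5 * (x * math.sqrt(1.0 + x * x) + math.asinh(x))

def log_psi(x, lam):
    # log of psi = v0 * exp(lam * S) with amplitude v0 = q^(-1/4).
    return lam * S(x) - 0.25 * math.log(q(x))

def scaled_residual(x, lam, h=1e-4):
    """lam^2 * (psi''/lam^2 - q*psi) / psi, using psi''/psi = g'' + g'^2
    for g = log(psi), with central differences of step h."""
    g0, gp, gm = log_psi(x, lam), log_psi(x + h, lam), log_psi(x - h, lam)
    g1 = (gp - gm) / (2.0 * h)
    g2 = (gp - 2.0 * g0 + gm) / (h * h)
    return (g2 + g1 * g1) - lam * lam * q(x)

# The scaled residual is nearly independent of lam (it approximates v0''/v0),
# so the true residual decreases like 1/lam^2.
for lam in (10.0, 100.0):
    print(lam, scaled_residual(0.5, lam))
```

At x = 0.5 the quantity $v_0''/v_0$ works out to $-0.2$ for this choice of q, which is the value both printed numbers approximate.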

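But we have also for any i along such a curve 

$$\frac{dy_i}{d\tau} = \sum_j \frac{\partial^2 S}{\partial x_i \, \partial x_j}\frac{dx_j}{d\tau} = \sum_j \frac{\partial^2 S}{\partial x_i \, \partial x_j}\frac{\partial P}{\partial y_j}.$$

Furthermore, from the multiplier equation (4) we infer (in $x_1, \ldots, x_n$ space) 

$$\frac{\partial P}{\partial x_i} + \sum_j \frac{\partial P}{\partial y_j}\frac{\partial^2 S}{\partial x_i \, \partial x_j} = 0,$$

so that we deduce 

$$\frac{dy_i}{d\tau} = -\frac{\partial P}{\partial x_i} \qquad (i = 1, \ldots, n). \tag{8}$$

The equations (7), (8) are the 2n ordinary differential equations in 
$x_1, \ldots, x_n$, $y_1, \ldots, y_n$, of Hamiltonian type which define the Cauchy characteristics 
of the multiplier equation (4), provided we restrict attention to 
those solutions (involving 2n − 1 arbitrary constants) for which P = 0. 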
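
As a hedged illustration, not taken from the note, the characteristic equations can be integrated numerically for a simple assumed polynomial, $P = \tfrac{1}{2}y^2 + \tfrac{1}{2}x^2 - E$ in one variable. The equations of Hamiltonian type are then $dx/d\tau = \partial P/\partial y = y$ and $dy/d\tau = -\partial P/\partial x = -x$. The sketch uses a classical fourth-order Runge-Kutta step (an arbitrary choice of integrator) and confirms that a characteristic started on P = 0 remains on P = 0 to within the integration error. The function names and the value of E are hypothetical.

```python
def P(x, y, E=0.5):
    """Assumed example polynomial; P = 0 is the equation of condition."""
    return 0.5 * y * y + 0.5 * x * x - E

def rhs(x, y):
    # dx/dtau = dP/dy = y,  dy/dtau = -dP/dx = -x
    return y, -x

def rk4_step(x, y, h):
    k1x, k1y = rhs(x, y)
    k2x, k2y = rhs(x + 0.5 * h * k1x, y + 0.5 * h * k1y)
    k3x, k3y = rhs(x + 0.5 * h * k2x, y + 0.5 * h * k2y)
    k4x, k4y = rhs(x + h * k3x, y + h * k3y)
    return (x + h * (k1x + 2 * k2x + 2 * k3x + k4x) / 6.0,
            y + h * (k1y + 2 * k2y + 2 * k3y + k4y) / 6.0)

def max_abs_P(x0=1.0, y0=0.0, h=0.01, steps=1000):
    """Largest |P| met along the characteristic through (x0, y0)."""
    x, y = x0, y0
    worst = abs(P(x, y))
    for _ in range(steps):
        x, y = rk4_step(x, y, h)
        worst = max(worst, abs(P(x, y)))
    return worst

print(max_abs_P())  # remains very small: the curve stays on P = 0
```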

The general significance of the above results is clear: Consider a regular 
transversal surface T of n — 1 dimensions so that, for the given function 
S, the corresponding characteristics intersect T in one and the same sense 




throughout. The value of $v_0$ may then be given arbitrarily on the transversal 
surface T, and is then determined elsewhere by integrating along 
these Cauchy characteristics. 

For $k \ne 0$, the equations analogous to (6') are of the form 

$$\frac{dv_k}{d\tau} + \Phi v_k + \Phi_k = 0 \qquad (k = 1, 2, 3, \ldots), \tag{6''}$$

where $\Phi_k$ depends linearly and homogeneously upon $v_0, v_1, \ldots, v_{k-1}$. Hence 
each coefficient $v_k$ is determined up to a constant multiple of $e^{-\int \Phi \, d\tau}$ by its 
value on T. 

If in particular $v_0, v_1, \ldots$ are taken as 0 in T except throughout some 
region V of T, and as vanishing along the boundary of this region, together 
with their partial derivatives, these functions may be taken to be 
0 except within the corresponding tube of characteristics (on which they 
are determined to the extent indicated above), and vanishing similarly 
on the boundary of the tube. Such a region V changes position as the 
"time" $\tau$ changes, and the series solution thus gives rise to what we shall 
term an "asymptotic wave packet." 

Let us call any function $v_0$ which satisfies the linear partial differential 
equation (6) an "amplitude function," associated with the corresponding 
"phase function" S. 

If $v_0$ is any assigned amplitude function of $S$ whatsoever, then the integral

$$I = \int_{V(\tau)} v_0^2\, dx_1 \cdots dx_n \tag{9}$$

does not vary with $\tau$. Here $V(\tau)$ is the region into which the arbitrary region $V$ at $\tau = 0$ is carried along the Cauchy characteristics in "time" $\tau$. Conversely if this integral is independent of $\tau$ for every $V$, $v_0$ will be an amplitude function of $S$.

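To prove this result we observe that the integral $I$ may be written

$$\int_V v_0^2(\bar{x}_1, \ldots, \bar{x}_n)\, J\, dx_1 \cdots dx_n,$$

where the transformation from $V$ to $V(\tau)$ is $\bar{x}_i = f_i(x_1, \ldots, x_n)$ $(i = 1, \ldots, n)$, and $J$ is the functional determinant $\partial(\bar{x}_1, \ldots, \bar{x}_n)/\partial(x_1, \ldots, x_n)$. Hence the derivative of $I$ at $\tau = 0$ (when $\bar{x}_i = x_i$) is

$$\int_V \left[ v_0^2\,\frac{dJ}{d\tau} + 2 v_0\,\frac{dv_0}{d\tau} \right] dx_1 \cdots dx_n.$$

But for $\Delta\tau$ small we have

$$\bar{x}_i = x_i + \frac{\partial P}{\partial y_i}\,\Delta\tau + \cdots \qquad (i = 1, \ldots, n),$$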


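and, in consequence,

$$J = 1 + \left[ \sum_i \frac{\partial^2 P}{\partial x_i\,\partial y_i} + \sum_{i,j} \frac{\partial^2 P}{\partial y_i\,\partial y_j}\,\frac{\partial^2 S}{\partial x_i\,\partial x_j} \right] \Delta\tau + \cdots,$$

whence

$$\left.\frac{dJ}{d\tau}\right|_{\tau = 0} = \sum_i \frac{\partial^2 P}{\partial x_i\,\partial y_i} + \sum_{i,j} \frac{\partial^2 P}{\partial y_i\,\partial y_j}\,\frac{\partial^2 S}{\partial x_i\,\partial x_j}.$$

Hence the integral written above vanishes because of the condition (6) upon $v_0$. Consequently $dI/d\tau = 0$ for $\tau = 0$, and likewise for any $\tau$, so that $I$ is necessarily constant. Conversely if the integral $I$ is independent of $\tau$ for all regions $V$, it is clear that $v_0$ must be an amplitude function. This completes our proof of the above italicized statement.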
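
The first-order expansion of the functional determinant used in this proof can be checked numerically. The sketch below is only an illustration under stated assumptions: the two-dimensional field `field` is a hypothetical stand-in for the velocities $\partial P/\partial y_i$ evaluated along $y_i = \partial S/\partial x_i$, and the short-time map is taken to be a single Euler step of length `dtau`. It compares $(J - 1)/\Delta\tau$ with the divergence of the field.

```python
# Sketch: first-order expansion of the Jacobian of a short-time flow map.
# The field below is hypothetical; it is not taken from the paper.

def field(x1, x2):
    """Hypothetical smooth velocity field (dx1/dtau, dx2/dtau)."""
    return (x1 + x2 * x2, x1 * x2)

def field_jacobian(x1, x2):
    """Matrix of partial derivatives dF_i/dx_j of the field above."""
    return ((1.0, 2.0 * x2),
            (x2, x1))

def step_jacobian(x1, x2, dtau):
    """Determinant of d(xbar)/d(x) for the one-step map xbar = x + F(x) * dtau."""
    (a, b), (c, d) = field_jacobian(x1, x2)
    return (1.0 + a * dtau) * (1.0 + d * dtau) - (b * dtau) * (c * dtau)

def divergence(x1, x2):
    """Trace of the matrix of partial derivatives, i.e. sum_i dF_i/dx_i."""
    (a, _), (_, d) = field_jacobian(x1, x2)
    return a + d

dtau = 1e-6
x = (0.3, -0.7)
rate = (step_jacobian(*x, dtau) - 1.0) / dtau
print(rate, divergence(*x))  # the two numbers agree up to terms of order dtau
```

The difference between the two printed values is the second-order term of the determinant, which is why it shrinks in proportion to `dtau`.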
Suppose now that we take $n + 1$ variables $t, x_1, \ldots, x_n$, with

$$P = \frac{\partial S}{\partial t} + H\!\left(x_1, \ldots, x_n, \frac{\partial S}{\partial x_1}, \ldots, \frac{\partial S}{\partial x_n}\right),$$

so that the "multiplier equation" takes the form of the Hamilton-Jacobi partial differential equation with energy $H$. The corresponding "principal linear equation" $L_0\psi = 0$ is then the usual Schrödinger wave equation


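$$\left[\frac{h}{2\pi i}\,\frac{\partial}{\partial t} + H\!\left(x_1, \ldots, x_n, \frac{h}{2\pi i}\,\frac{\partial}{\partial x_1}, \ldots, \frac{h}{2\pi i}\,\frac{\partial}{\partial x_n}\right)\right]\psi = 0 \tag{10}$$

provided we take $\lambda = 2\pi i/h$. Furthermore the corresponding Hamiltonian equations are

$$\frac{dx_i}{d\tau} = \frac{\partial H}{\partial y_i}, \qquad \frac{dy_i}{d\tau} = -\frac{\partial H}{\partial x_i} \qquad (i = 1, \ldots, n) \tag{11}$$

together with

$$\frac{dt}{d\tau} = 1, \qquad \frac{d}{d\tau}\left(\frac{\partial S}{\partial t}\right) = 0,$$

so that $\partial S/\partial t = \text{const.}$ and $\tau = t + \text{const.}$

The Schrödinger wave equation is therefore one which has the usual Hamilton-Jacobi partial differential equation as its multiplier equation.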

Evidently $t = t_0$ is a transversal surface $T$ in this special case and the $n + 1$ dimensional element of volume is $dt\,dx_1 \cdots dx_n$. Furthermore we may write

$$S = c_n t + S^*(x_1, \ldots, x_n;\ c_1, \ldots, c_n) + c_{n+1}$$

since $\partial S/\partial t$ appears only linearly. Here the Cauchy characteristics become the dynamical trajectories of the corresponding Hamiltonian system (11).




If we write to a first approximation

$$\psi \sim e^{\lambda S} v_0, \qquad \bar\psi \sim e^{-\lambda S} v_0$$

($v_0$ real, $\lambda$ pure imaginary), we see that along any part of an asymptotic wave packet, the integral $\int \psi\bar\psi\, dt\, dx_1 \cdots dx_n$ reduces to $\int v_0^2\, dt\, dx_1 \cdots dx_n$ which by our general result remains constant. It follows then that $\int \psi\bar\psi\, dx_1 \cdots dx_n$ also remains constant. To establish this we begin with the $n + 1$ dimensional element of volume, taking limits $t_0$ and $t_1$ for $t$ with $t_1 - t_0 = \Delta t$, and then let $\Delta t$ tend to zero. Hence we arrive at the following result:

In the special case of the Schrödinger wave equation in $\psi$, the "asymptotic wave packets" follow the corresponding dynamical trajectories, while the squared amplitude integral $\int \psi\bar\psi\, dx_1 \cdots dx_n$ remains constant over any part of the packet, at least to the order of terms in $1/\lambda$.⁵
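
A small numerical illustration of this constancy may be helpful. Everything specific in the sketch is an assumption made for the example and is not taken from the paper: one space dimension, the free-particle energy $H = y^2/2$, the phase function $S = x^2/2(1 + t)$ (which satisfies $\partial S/\partial t + \tfrac{1}{2}(\partial S/\partial x)^2 = 0$), the trajectories $x = \xi(1 + t)$ that go with it, and an amplitude that decays like $(1 + t)^{-1/2}$ along each trajectory, which is what an amplitude equation with coefficient equal to half the divergence computed in the proof above would give. The profile `initial_amplitude` is hypothetical.

```python
import math

def initial_amplitude(xi):
    """Hypothetical profile of v0 at t = 0; it vanishes at xi = 0 and xi = 1."""
    return math.sin(math.pi * xi) ** 2

def amplitude(x, t):
    """Amplitude carried along the assumed trajectories x = xi * (1 + t)."""
    return initial_amplitude(x / (1.0 + t)) / math.sqrt(1.0 + t)

def packet_integral(t, a=0.0, b=1.0, steps=2000):
    """Midpoint-rule value of the integral of v0^2 over V(t) = [a(1+t), b(1+t)]."""
    lo, hi = a * (1.0 + t), b * (1.0 + t)
    h = (hi - lo) / steps
    return sum(amplitude(lo + (k + 0.5) * h, t) ** 2 for k in range(steps)) * h

values = [packet_integral(t) for t in (0.0, 0.5, 2.0)]
print(values)  # three values that agree to rounding error
```

The region $V(t)$ widens by the factor $1 + t$ while the squared amplitude drops by the same factor, so the integral over the moving region does not change.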

In conclusion, it is desirable to note what happens under any change of independent variables in the general case

$$\bar{x}_i = f_i(x_1, \ldots, x_n) \qquad (i = 1, \ldots, n).$$

It is to be observed first of all that because of the identities

$$\frac{\partial u}{\partial x_i} = \sum_j \frac{\partial \bar{x}_j}{\partial x_i}\,\frac{\partial u}{\partial \bar{x}_j}, \qquad \frac{\partial^2 u}{\partial x_i\,\partial x_j} = \sum_{k,l} \frac{\partial \bar{x}_k}{\partial x_i}\,\frac{\partial \bar{x}_l}{\partial x_j}\,\frac{\partial^2 u}{\partial \bar{x}_k\,\partial \bar{x}_l} + \sum_k \frac{\partial^2 \bar{x}_k}{\partial x_i\,\partial x_j}\,\frac{\partial u}{\partial \bar{x}_k}, \quad \text{etc.},$$

the components of

$$L = L_0 + \frac{1}{\lambda} L_1 + \cdots$$

do not remain individually invariant in the equation (1), and in particular the principal part $L_0$ is not carried over into the new principal part by the ordinary rules. In fact the coefficients in the principal part transform by the rules valid for the attached Hamilton-Jacobi equation. Hence Schrödinger's wave equation in the form (10) is only maintained (in general) under a linear transformation of the independent variables.

This fact indicates that any coordinate system from which we start is to be regarded as a privileged absolute system of reference for the Schrödinger wave equation, up to an arbitrary linear transformation.

¹ L. de Broglie and L. Brillouin, Selected Papers on Wave Mechanics, 19-54 (translation, London, 1928).

² E. Schrödinger, Collected Papers on Wave Mechanics, 1-30 (translation, London, 1928).

³ P. A. M. Dirac, Quantum Mechanics, 119-123 (Oxford, 1930).

⁴ See my paper, Trans. Am. Math. Soc., 9, 219-231 (1908).

⁵ Dirac asserts incorrectly (loc. cit., p. 121) that the amplitude is constant along a dynamical trajectory. He overlooks the terms in the second partial derivatives in the equation corresponding to (6) in the special case.





Reprinted from Proc. Nat. Acad. Sciences, April, 1933, Vol. 19, p.475. 


A CORRECTION 
By George D. Birkhoff

In my paper "Some Remarks Concerning Schrödinger's Wave Equation" (these Proceedings, March 1933) the parenthesis in equation (6) requires an additional term $2Q$ where $Q$ is related to $L_1$ just as $P$ is to $L_0$. The invariant integral (9) should be written:

$$I = \int_{V(\tau)} G\, v_0^2\, dx_1 \cdots dx_n$$

where $G$ is any solution of the linear partial differential equation in $G$ for the given phase function $S$:




-QG 3 o 


The proof is only modified in detail. In the special case of the Schrödinger equation, $G = 1$ furnishes a solution while $Q \equiv 0$.





Reprinted from Annali di Matematica Pura ed Applicata, 1933, s. 4,
Vol. 12, pp. 117-133. 


On the Periodic Motions Near a given Periodic Motion 

of a Dynamical System. 


By G. D. Birkhoff and D. C. Lewis, Jr. (Cambridge (Mass.), U. S. A.).


§ 1. Introduction. — In a recent note (Comptes Rendus, 1931), Birkhoff proved a $2n$-dimensional generalization of a simple special case of Poincaré's two-dimensional geometric theorem. It was there suggested how this theorem might be useful in establishing the existence of infinitely many periodic motions (of a dynamical system with fixed energy constant) in the neighborhood of a given periodic motion of general stable type. This application is carried out for the first time in the present paper. A summary of the necessary preliminaries is also given.

Suppose we have a dynamical system with $n + 1$ degrees of freedom and a given periodic motion of general stable type. By a change of variables and a reduction of the order of the system with the help of the energy integral and the elimination of the time, the system can be written in the Hamiltonian form,

$$\frac{dx_i}{dt} = \frac{\partial H}{\partial y_i}, \qquad \frac{dy_i}{dt} = -\frac{\partial H}{\partial x_i}, \qquad i = 1, 2, \ldots, n, \tag{1.1}$$

where $H$ is an analytic function of $x_1, y_1, x_2, y_2, \ldots, x_n, y_n$ and $t$, and admits the period $2\pi$ in $t$. The periodic motion appears as a « generalized equilibrium » point, $x_1 = y_1 = x_2 = y_2 = \cdots = x_n = y_n = 0$, and any further periodic solutions of (1.1), near this equilibrium point and having a period which is an integral multiple of $2\pi$, correspond to periodic motions in the original $(2n + 2)$th order system near the given periodic motion.

Let

$$x_i = f_i(x_{10}, y_{10}, \ldots, x_{n0}, y_{n0}, t), \qquad y_i = g_i(x_{10}, y_{10}, \ldots, x_{n0}, y_{n0}, t), \qquad i = 1, \ldots, n,$$

be the solution of (1.1) which takes on the initial values $x_{10}, y_{10}, \ldots, x_{n0}, y_{n0}$ for $t = 0$, and let

$$x_{i1} = f_i(x_{10}, y_{10}, \ldots, x_{n0}, y_{n0}, 2\pi), \qquad y_{i1} = g_i(x_{10}, y_{10}, \ldots, x_{n0}, y_{n0}, 2\pi).$$




These equations define a transformation $T$ of the neighborhood of the origin into itself, and evidently there is a one-to-one correspondence between the periodic solutions of period $2m\pi$ and the points that are invariant under $T^m$, the $m$th iterate of $T$. The following method for detecting these invariant points was given by Birkhoff as a generalization of Poincaré's geometric theorem:

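Let $x_{1m}, y_{1m}, \ldots, x_{nm}, y_{nm}$ represent the point into which the point $x_{10}, y_{10}, \ldots, x_{n0}, y_{n0}$ is carried by $T^m$. On account of the well known relative integral invariants of (1.1), it is seen that

$$dJ = \frac{1}{2}\sum_{i=1}^{n} \left[ (x_{im}\,dy_{im} - y_{im}\,dx_{im}) - (x_{i0}\,dy_{i0} - y_{i0}\,dx_{i0}) \right]$$

is an exact differential. Changing the variables to the modified polar coordinates, $u_i = x_i^2 + y_i^2$ and $\theta_i = \tan^{-1}(y_i/x_i)$, we find that

$$dJ = \frac{1}{2}\sum_{i=1}^{n} \left( u_{im}\,d\theta_{im} - u_{i0}\,d\theta_{i0} \right).$$

Now suppose that we are able to find a manifold defined by the equations,

$$u_{i0} = B_i(\theta_{10}, \ldots, \theta_{n0}), \qquad i = 1, \ldots, n \qquad (B_i \text{ analytic, periodic}), \tag{1.2}$$

such that along this manifold $\theta_{im}$ always differs from $\theta_{i0}$ by some integral multiple of $2\pi$, i.e. $\theta_{im} - \theta_{i0} = 2k_i\pi$. Then we have $d\theta_{im} = d\theta_{i0}$ and hence

$$dJ = \frac{1}{2}\sum_{i=1}^{n} (u_{im} - u_{i0})\, d\theta_{i0}$$

along the manifold. Integrating, we get $J$ as a single valued function of $(x_{10}, y_{10}, \ldots, x_{n0}, y_{n0})$, unique save for an additive constant, defined over the manifold. Considered as a function of the $\theta_{i0}$'s, it must therefore be periodic and must have at least $2^n$ critical points (¹). But any critical point of $J$ on the manifold is obviously invariant under $T^m$, since $dJ = 0$ implies that $u_{im} = u_{i0}$, while we already know that for the point in question $\theta_{im} = \theta_{i0} + 2k_i\pi$.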
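
The count of $2^n$ critical points can be made concrete on the simplest periodic function. The sketch below is an illustration only: the function `J` is a hypothetical example, the sum of the cosines of the angles, chosen because its gradient vanishes exactly where every angle is $0$ or $\pi$. The code lists those points and groups them by the number of directions in which `J` decreases; the group sizes come out as the binomial coefficients.

```python
import itertools
import math

def J(thetas):
    """Hypothetical periodic function on the n dimensional torus."""
    return sum(math.cos(t) for t in thetas)

def gradient(thetas):
    """Partial derivatives of J with respect to each angle."""
    return [-math.sin(t) for t in thetas]

def critical_points(n):
    """Points with every angle equal to 0 or pi, kept if the gradient vanishes there."""
    candidates = itertools.product((0.0, math.pi), repeat=n)
    return [p for p in candidates if all(abs(g) < 1e-12 for g in gradient(p))]

for n in (1, 2, 3):
    by_index = {}
    for p in critical_points(n):
        # cos has a maximum at 0, so each angle equal to 0 is a direction of decrease
        index = sum(1 for t in p if t == 0.0)
        by_index[index] = by_index.get(index, 0) + 1
    print(n, len(critical_points(n)), by_index)
```

For $n = 3$ this prints 8 critical points, split as 1, 3, 3, 1 according to the index.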

(¹) A critical point is a point for which $dJ = 0$: these are to be counted with their proper multiplicity. The existence of two critical points (maximum and minimum) is obvious. An easy method of establishing the existence of $2^n - 2$ other critical points is to apply M. Morse's critical point relations (see, for instance, his paper, Relations between the Critical Points of a Real Function of n Real Variables, « Trans. Am. Math. Soc. », vol. 27 (1925), pp. 345-396) to the $n$ dimensional torus for which the connectivity numbers (mod 2) are the binomial coefficients.

The existence of periodic motions therefore depends upon the existence of manifolds of the type (1.2). To prove that these manifolds really do exist, we make use of a preliminary normalization of equations (1.1) (²). In terms of conjugate imaginary variables $p_1, q_1, \ldots, p_n, q_n$, the transformation $T$ can be written in the form


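$$p_{i1} = p_{i0}\, e^{\sqrt{-1}\, M_i(p_0 q_0)} + \Phi_i(p_0, q_0), \qquad q_{i1} = q_{i0}\, e^{-\sqrt{-1}\, M_i(p_0 q_0)} + \Psi_i(p_0, q_0).$$

The $\Phi_i(p_0, q_0)$ and $\Psi_i(p_0, q_0)$ are convergent power series in $p_{10}, q_{10}, \ldots, p_{n0}, q_{n0}$, beginning with terms of degree $2\mu + 1$, where $\mu$ is arbitrarily large. The $M_i(pq)$ are polynomials with real coefficients of degree $\mu$ at most in the $n$ products $p_1 q_1, p_2 q_2, \ldots, p_n q_n$. Setting $p_i q_i = u_i$, we accordingly write $M_i(u) = \psi_i + \sum_{j=1}^{n} c_{ij} u_j + \cdots$.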

The significance of the fact that we are dealing with a given periodic motion of general stable type is that there are no homogeneous linear relations with integral coefficients (not all zero) connecting the $\psi_i$ and $2\pi$, and that the determinant $|c_{ij}|$ is not zero. We shall regularly denote by $c^{ij}$ the cofactor of $c_{ij}$ divided by the determinant itself, so that

$$\sum_j c_{jk}\, c^{ji} = \sum_j c_{kj}\, c^{ij} = \delta_{ki}.$$


We now change back to real coordinates, $x_i = \dfrac{p_i + q_i}{2}$, $y_i = \dfrac{p_i - q_i}{2\sqrt{-1}}$. It is to be remembered that these changes in coordinates do not destroy the Hamiltonian form of equations (1.1). The transformation $T$ now appears in the form,

$$x_{i1} = x_{i0}\cos\varphi_i - y_{i0}\sin\varphi_i + X_i(x_0, y_0), \qquad y_{i1} = x_{i0}\sin\varphi_i + y_{i0}\cos\varphi_i + Y_i(x_0, y_0),$$

where, for abbreviation, we have set $M_i(x_0^2 + y_0^2) = \varphi_i$. The $X_i(x, y)$ and $Y_i(x, y)$ are real convergent power series in $x_1, y_1, x_2, y_2, \ldots, x_n, y_n$ beginning with terms of degree $2\mu + 1$. Finally on introducing modified polar coordinates, $u_i = x_i^2 + y_i^2$, $\theta_i = \tan^{-1}(y_i/x_i)$, the transformation $T$ takes the form

$$u_{i1} = u_{i0} + U_i(u_0, \theta_0), \qquad \theta_{i1} = \theta_{i0} + \varphi_i + \Theta_i(u_0, \theta_0).$$
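
The effect of the modified polar coordinates is easiest to see when the series $X_i$, $Y_i$ are dropped. The sketch below makes that simplification and takes one pair $(x, y)$ only; the linear form of `twist_angle` and its constants `psi`, `c` are hypothetical values chosen for the example. Under these assumptions the map is a rotation through an angle depending on $u$, so $u$ is unchanged and $\theta$ advances by exactly that angle.

```python
import math

def to_modified_polar(x, y):
    """Modified polar coordinates u = x^2 + y^2, theta = atan2(y, x)."""
    return x * x + y * y, math.atan2(y, x)

def twist_angle(u, psi=0.4, c=1.5):
    """Hypothetical rotation angle M(u) = psi + c*u for one degree of freedom."""
    return psi + c * u

def unperturbed_map(x, y):
    """Rotation through phi = M(x^2 + y^2); the series X and Y are set to zero."""
    phi = twist_angle(x * x + y * y)
    return (x * math.cos(phi) - y * math.sin(phi),
            x * math.sin(phi) + y * math.cos(phi))

x0, y0 = 0.3, 0.2
u0, theta0 = to_modified_polar(x0, y0)
u1, theta1 = to_modified_polar(*unperturbed_map(x0, y0))
print(u1 - u0, theta1 - theta0, twist_angle(u0))
```

The first printed number is zero to rounding error and the last two coincide, which is the content of the polar form of $T$ when $U_i$ and $\Theta_i$ vanish.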


(²) Cf. G. D. Birkhoff, Dynamical Systems, Chapter III, particularly § 9. Also Chapter VI, § 1.


The formal expressions for $U_i$ and $\Theta_i$ are readily written down:

$$U_i(u_0, \theta_0) = 2X_i\, u_{i0}^{1/2}\cos(\theta_{i0} + \varphi_i) + 2Y_i\, u_{i0}^{1/2}\sin(\theta_{i0} + \varphi_i) + X_i^2 + Y_i^2,$$

$$\Theta_i(u_0, \theta_0) = \arctan \frac{-X_i\sin(\theta_{i0} + \varphi_i) + Y_i\cos(\theta_{i0} + \varphi_i)}{u_{i0}^{1/2} + X_i\cos(\theta_{i0} + \varphi_i) + Y_i\sin(\theta_{i0} + \varphi_i)}.$$

$U_i(u, \theta)$ may be represented as a convergent power series in $u_1^{1/2}, u_2^{1/2}, \ldots, u_n^{1/2}$ with coefficients which are analytic periodic functions of $\theta_1, \theta_2, \ldots, \theta_n$ of period $2\pi$. It begins with terms of degree $2\mu + 2$ in the $u^{1/2}$'s. The expression for $\Theta_i(u, \theta)$ is not so simple and will be discussed later.


§ 2. Some Fundamental Inequalities. — Let it be understood once and for all that the capital letter $A$, followed perhaps by a subscript, is used throughout this paper to denote a suitably chosen positive number, independent of $u_1, \theta_1, \ldots, u_n, \theta_n$. Thus, for example, we know from the power series development of $U_i(u, \theta)$ in powers of $u_1^{1/2}, u_2^{1/2}, \ldots, u_n^{1/2}$ that


\U t {u. »)\£A 




A t n* 






provided that the u s are sufficiently small. Thus, we may write: 


( 2 . 1 ) 


I ••) | £ 



The point in $2n$ dimensional space whose modified polar coordinates are represented by $u_1, \theta_1, u_2, \theta_2, \ldots, u_n, \theta_n$ will be denoted by the symbol $(u, \theta)$. Sometimes, when the $\theta$'s are not being emphasized and no confusion is likely to result, this same point will be denoted by the more abbreviated symbol $(u)$. The « distance » between two such points, $(u, \theta)$ and $(u', \theta')$ is defined as $\sqrt{\sum_{j=1}^{n} (u_j - u_j')^2}$. The « distance » is thus independent of the $\theta$'s and is equal to the ordinary distance between two corresponding points $(u)$ and $(u')$ in $n$ dimensional space. The distance of $(u, \theta)$ from the origin will be denoted by $\zeta$.

Let $\alpha$ denote a fixed positive number less than 1/2. We shall show that the following inequalities hold as long as $\zeta$ is sufficiently small and $u_i/u_j \ge \alpha$ for all pairs of indices, $i, j = 1, 2, \ldots, n$:


1 UA". ei| <-4- 


IB.ln, B)| ^ >1 


1 dU tl 'dUj 

ISA- 


1 3 UiM, 

1 £A 

;,oi 

I 1 £ A 



ISA 



The first of these inequalities follows immediately from (2.1) and from the fact that $\big(\sum_j u_j\big)^2 \le n \sum_j u_j^2 = n\zeta^2$. In order to prove the second inequality, we consider briefly the function $\Theta_i(u, \theta)$. It is of the form $\arctan \dfrac{f(u, \theta)}{u_i^{1/2} + g(u, \theta)}$, where $f$ and $g$ are convergent power series in $u_1^{1/2}, u_2^{1/2}, \ldots, u_n^{1/2}$ with coefficients which are analytic periodic functions of $\theta_1, \theta_2, \ldots, \theta_n$. They begin with terms

I H 1 |l|l I I 

for u, t ... u„ sufficiently small. Let us temporarily make the definition: 


H i;; n, H) = arctan 


so,ha. H. ( „,9, = e,^; ,, «). 


We have I «,l5: n, 8| | £ A, \ \f{u, »l| S ,on l! *» IVtMII, 

| *y\n. H)|, and the u's are sufficiently small. Therefore 

i «"■ £ - M J.(S)T II “'f- 

But. we are assuming that u,ju,^a. and therefore we get 

i hj«. flii< 

as long as ^ is sufficiently small: here ,4* r= .djii 2 \ a. Similar considerations 

i 

applied to the partial derivatives of $\Theta_i(\xi; u, \theta)$ with respect to the $u$'s and the $\theta$'s, enable us to obtain the appraisals for $\partial\Theta_i/\partial u_j$ and $\partial\Theta_i/\partial\theta_j$. The appraisals for $\partial U_i/\partial u_j$ and $\partial U_i/\partial\theta_j$ are even easier.


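§ 3. A Simplification of the Coordinate System $(u, \theta)$. — If we make the change of variables $\bar{u}_i = \sum_k c^{ki}\,[M_k(u) - \psi_k]$, the transformation $T$ is readily seen (since $\sum_j c_{hj}\, c^{kj} = \delta_{hk}$) to take the simpler form:

$$u_{i1} = u_{i0} + U_i(u_0, \theta_0), \qquad \theta_{i1} = \theta_{i0} + \psi_i + \sum_{j=1}^{n} c_{ij}\, u_{j0} + \Theta_i(u_0, \theta_0), \tag{3.1}$$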


where in accordance with our later notation the dashes over the new variables have been omitted. This change of variables is such that

$$\bar{u}_i = u_i + P_i(u); \qquad u_i = \bar{u}_i + \bar{P}_i(\bar{u}), \tag{3.2}$$

where $P_i(u)$ is a polynomial in $u_1, \ldots, u_n$ which lacks constant and linear terms, and $\bar{P}_i(\bar{u})$ is a convergent power series beginning with quadratic terms.

We must show that the inequalities of the previous paragraph still hold for these new variables as long as $\bar\zeta$ is sufficiently small and $\bar{u}_i/\bar{u}_j \ge \bar\alpha$ for all pairs of indices $i, j = 1, 2, \ldots, n$. Here $\bar\alpha$ is a fixed positive number less than 1/2, and $\bar\zeta = \sqrt{\sum_j \bar{u}_j^2}$.

Let $\alpha$ be any number such that $0 < \alpha < \bar\alpha$. Then starting with (3.2) it is easily shown that the fact that $\bar{u}_j/\bar{u}_k \le 1/\bar\alpha$ (for all pairs of indices $j$ and $k$) implies that $u_j/u_k \le 1/\alpha$, provided that the $\bar{u}$'s are taken sufficiently small. It also follows from (3.2) that $\zeta/\bar\zeta$ and the derivatives $\partial u_i/\partial \bar{u}_j$ are bounded for small values of the $\bar{u}$'s. This is all that is needed to verify the validity of inequalities (2.2) and (2.1) in the new variables. Hereafter these new variables will be used exclusively with the dashes omitted.


§ 4. The Behavior of the Image of a Point under the Iterates of $T$. — Let the $m$th iterate of $T$ take the point $(u_0, \theta_0)$ into $(u_m, \theta_m)$. In this paragraph we prove two fundamental theorems about the behavior of $(u_m)$ for large values of $m$ and small values of the $u$'s.

Theorem I. — If $(u_0, \theta_0)$ is at a sufficiently small distance $\zeta_0$ from the origin, then the distance $\zeta_m$ of $(u_m, \theta_m)$ from the origin does not exceed $n\zeta_0$ as long as $m \le A_{10}\,\zeta_0^{-\mu}$ $(n \ge 2)$.



s+1 


Proof: From (3.1) and |2.1) we have 


| Aw , m | = | u lrn +,- u tm | ^ A t -yXu Jm 
Hence u im increases less rapidly with m than as if = jf 


- Mjm 
l=i 


ini 


It follows that £ u Jm can not increase to \ n times its initial value for m = 0 

>-i 

until 

Hence, as long ns m g j 4 we have 

£ <I £ "jm\& ” I £ nJs «* £ 

>=j u-i I u-i J >=> 

wince i. e. as long as m q. e. d. 

Theorem II. — If the point $(u_0)$ is such that $u_{j0}/u_{k0} \ge 2\alpha$ for all ordered pairs of indices $j$ and $k$, then $u_{jm}/u_{km} \ge \alpha$ as long as $m \le A_{14}\,\zeta_0^{-\mu}$, provided that $\zeta_0$ is sufficiently small.

Proof: As in the proof of theorem I. we have 

U«i 1 

As long as w /I we have from theorem I, Aw, m < -/l, t C 0 '* 1 '• Hence 

y £ Au im * < = d, which is not less than the greatest distance the 

r i*il 

point (m,„) can move at each application of the transformation T. 

Let Uj 9 'u k , — 2$ Jk ^2a. Also let X ( = u, # /v«. that the are the 
« direction cosines » of the ray from the origin through (u s ). We have 

X./X, ;> 2*. Hence nX. 5 > 4<x* £ X/ = 4a’. Therefore X, ^ 

*-i \n 

We consider some other point (n). which for a certain pair of indices 
j and A*, is such that Uj/n M = a. The distance between the point (u,) (regarded 
ns fixed) and the point |i<) [regarded as variable subject to the condition 

Uj — %u h •— 0) is given by 1/ S(w,— tf,*)*, the minimum value, D, of which 

r * =i 

is found by elementary methods to be 


j) — u J n T g,l »« _ ~ -> _ gX„;, 


2a* 


VI 


VI 


VI 


VI 


Vw(H-« ? ) 


= Co = ^-S.. 




This distance cannot be traversed by the point (« m | upon successive iterations 
It T until -less perhaps m first becomes grower 

than vl, 0 C<r>N at which P° int lhe n<?ce88ar y information from Theorem I 

no longor be forthcoming. . 

Hence the theorem is true as stated, if we denote by A ti e es 

the two numbers A to and [AJA„). . ^ 

Let the region $R(\eta, \alpha)$ denote the collection of points for which $\zeta \le \eta$ and $u_j/u_k \ge \alpha$ for all pairs of indices $j$ and $k$. Theorems I and II show that if $(u_0, \theta_0)$ is a point of $R(\eta, 2\alpha)$, then the image point $(u_m, \theta_m)$ under $T^m$ must lie within $R(n\eta, \alpha)$ as long as $m \le A_{14}\,\zeta_0^{-\mu}$, provided that $\zeta_0$ is sufficiently small.


§ 5. The Non-Vanishing Property of the Jacobian. — We now proceed 

to prove . 

Theorem III. — If K is any positive number and if is a sufficiently 

small positive number, then for (u 0 , 6.) in Rfr, 2a» the derivative ^ differs 

from mcij by a quantity which tends to zero with C 0 , a* long as m does not 

exceed KC 0 ”* + \ This tendency to zero is uniform with respect to m. 

Proof: We introduce the notation = v ik [m), = iv tk {m). [m, k) 

will be used as a symbol to denote any linear homogeneous function of 
t> ,*(»>*),... v nk [m), w, *<m), »,*(*»).... whose coefficients, depending 

upon m and (it,, 0 O ), are infinitesimals of at least the (|i — 1)"' order in Co 
for m&Kir*' 1 un *f°rmly for (««,, 0.) in Riq, 2a). The sum of any definite 
number N of the symbols (m, Ar) is another symbol [m, Ac], N being assumed 
independent of m or Co* L* 1 


(5.1) 


a 


•* 


»£!. _ c 

du k .- Clk 


v.a(1) = <*,*. 


ae, 

^«A0 

t>.*(0) = o. 


<U 

>*>«a( 0 ) = $,* 


Now by the elementary rules for partial differentiation we find 

«-.(*- + » =|, «*<«•»+!. 

W( ,|m + 1) = S -4-2 

>=1 1=1 cw Jnt 





But 

3*W. * , U . 30,(U m , « w ) 

, Bu Jm ° ,J 3m J m , * 

3ll, m> , _ *VA*m • ®mj 3IW, _ . 6 m ) 

d^jnt Bitj„, i 3 

We now use the inequality's |2.2| with reference to the point (tt H ,, 0,„). 
These inequalities are here applicable, because from Theorem II u Jm ltt h „,^a 
for all pairs of indices j and A, m being restricted in such a way that 
Remembering from Theorem I that and introducing 

the symbols |w. A|. we therefore get from (5.2) 

$$\text{I.}\quad v_{ik}(m+1) = v_{ik}(m) + \sum_{j=1}^{n} c_{ij}\, w_{jk}(m) + (m, k), \qquad \text{II.}\quad w_{ik}(m+1) = w_{ik}(m) + (m, k). \tag{5.3}$$

We proceed to show how the $w$'s can be eliminated from equations (5.3). Replacing $m$ by $m + 1$ in equations (5.3) I, we have

$$v_{ik}(m+2) = v_{ik}(m+1) + \sum_{j=1}^{n} c_{ij}\, w_{jk}(m+1) + (m+1, k).$$

Subtracting I from this, we get after transposing,

$$\Delta^2 v_{ik}(m) = v_{ik}(m+2) - 2v_{ik}(m+1) + v_{ik}(m) = \sum_{j=1}^{n} c_{ij}\,[w_{jk}(m+1) - w_{jk}(m)] + (m+1, k) + (m, k).$$

We eliminate the $w_{jk}(m+1)$ from these equations with the help of (5.3) II and thus obtain

$$\Delta^2 v_{ik}(m) = (m+1, k) + (m, k), \tag{5.4}$$

where now the $w_{jk}(m+1)$ have already been eliminated from the symbol $(m+1, k)$. We now solve equations (5.3) I for the $w_{jk}(m)$ in terms of the $v_{ik}(m)$ and $v_{ik}(m+1)$. We can clearly do this, since the determinant of the coefficients of the unknowns is precisely the non-zero determinant $|c_{ij}|$ plus an infinitesimal in $\zeta_0$ of order $\mu - 1$ at least. Substituting the resulting expressions for the $w_{jk}(m)$ into the right members of (5.4), we see that the required elimination has been completely effected.
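
The role of the second difference is perhaps clearest in the idealized situation where every symbol $(m, k)$ is exactly zero. The sketch below iterates relations I and II under that assumption, with a hypothetical $2 \times 2$ matrix `c`, starting from $v_{ik}(0) = 0$ and $w_{ik}(0) = \delta_{ik}$. The $v_{ik}(m)$ then grow exactly like $m\,c_{ik}$ and their second differences vanish.

```python
def iterate_tangent(c, steps):
    """Iterate relations I and II of (5.3) with every error symbol (m, k) set to zero."""
    n = len(c)
    v = [[0.0] * n for _ in range(n)]                                   # v_ik(0) = 0
    w = [[1.0 if i == k else 0.0 for k in range(n)] for i in range(n)]  # w_ik(0) = delta_ik
    history = [v]
    for _ in range(steps):
        v = [[v[i][k] + sum(c[i][j] * w[j][k] for j in range(n)) for k in range(n)]
             for i in range(n)]
        # relation II without error terms leaves w unchanged
        history.append(v)
    return history

c = [[2.0, 1.0], [1.0, 3.0]]  # hypothetical coefficients c_ij with nonzero determinant
hist = iterate_tangent(c, 5)
second_diff = [[hist[2][i][k] - 2 * hist[1][i][k] + hist[0][i][k] for k in range(2)]
               for i in range(2)]
print(hist[5], second_diff)
```

With the error symbols restored, the argument of the text bounds how far the true $v_{ik}(m)$ can drift from this linear growth.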

For convenience, let us now introduce the functions $y_{ik}(m)$ and the numbers $b_{ik}$ as follows:

$$b_{ik} = a_{ik} \text{ and } y_{ik}(m) = v_{ik}(m), \text{ if } a_{ik} \ge 0; \qquad b_{ik} = -a_{ik} \text{ and } y_{ik}(m) = -v_{ik}(m), \text{ if } a_{ik} < 0. \tag{5.5}$$




We first wish to find out how Urge the *.*(«*» can become while m is 

restricted by the inequality £ 2KC."’*'• Evidently, on account of (5.4|, 
the | j/.j, | can not increase as rapidly as they would if 


A'ftaM = 2 l M» 

» j=i i 


where for abbreviation P = A solution (unique for integral values 

of ml of this system of difference equations, under the initial conditions. 

,5.6) !/.a(0) = 0, i/,*(1| = 6.*, 

is readily found to be 


— m 


1/». \fa -t-g+v it+f )"■ - (i i- f - v -if + p 1 ) 

(5.7) = + 2 V27T7 

= 6,,»«-*--( S »,*V _3Q ( w P i - <**)• 

«\/«l / 

I 1 

where Q is a convergent power series in powers of mp- and p*. which lacks 
constant and linear terms. 

Let us see how (5.7) behaves as we let (and consequently p) approach 
zero and allow m to take on values in the range. 


t-H 

y 

—Ct. 3 


0 < m ^ 2 KS. 

(5.8) i. e. 0 < m ^ A„t~ ‘ ’ ° = 2/CCT 1 ' \ 'vliere «o = ^ Vf, ^ 0< 

We have from (5.5), (5.1). and (2.2): 

(5.9) \b, k — |c,a||<> 11 c,» | = absolute value of c,„ and not the determinant) 

provided that C. is sufficiently small. Here we increase the number A lA 
defining p, if necessary. From the power series development of Q. we have 

! p*)| ^ aJihp* I- p*) = A, „(in ? p I- 2mp »- p). 

ns long as both nip* and p 5 are taken sufficiently small. On account of (5.8) 
this requirement is surely fulfilled, if s# »» sufficiently small: therefore 

| (T ' o(iMP*. P*) i £ A ,,(u»’p 3 +- 2wp* -4- p«)s -J.,U,,V“ -4- 2/1 „P 1+ " -I- P*) 

by (5.8). Hence, since i b JM is bounded, we see that the second term in the 




right hand member of (5.7) tends to zero with p and (5.7) and (5.9) thus 
yield the result that 

ITmM — I | m = t,»(£©, in) 


whore limi(;,, mi) = 0 uniformly in in, for the range (§.8). 

Now lot us see how rapidly the slopes of the y ik [w) can decrease, while 
we limit m to lie in the range, 

(5.10) 0 < m< A n p~ 1 = 2K£ 0 4+l — 1. 

From the above results, it follows that the right hand members of (5.4) can 

a 

not exceed, in absolute value, an expression of the form A n p* . Hence 
the y ih . can not decrease as rapidly as they would if they satisfied the dif¬ 
ference relation A’^mi) = — A tt p*+'*. But. using the initial conditions (5.8), 
this yiolds i/ ( *(mi) = b ik m — A lv p*^* mi(»ii — 1). Also 


-4„ P l + J £ 1n{tn ” ** 




, -I- v 

w'p* —mp* 


- 2 








which tends to zero with p. Hence, we again obtain the result that f/udm) 
differs from |c u |m by an infinitesimal in Thus the true value of y, k {m) 
must also have this property. That is, from (5.5), 

v, k {ni) m m ± y, k {w) = C, k tn- f- i,*(C. • «*). 
cn k . 


where i lA (v 0 , mi) represents an infinitesimal in uniformly with respect to mi, 
as long as mi lies on the range (5.10). It remains only to note that mi will 
surely lie on the range (5.10), if it satisfies the inequalities 


provided, as always, that sufficiently small. 

Incidentally the theorem just proved shows us that $\dfrac{1}{m^n}\,\dfrac{\partial(\theta_{1m}, \ldots, \theta_{nm})}{\partial(u_{10}, \ldots, u_{n0})}$ differs from the determinant $|c_{ij}|$ by an infinitesimal in $\zeta_0$ and hence can not vanish for $\zeta_0$ sufficiently small.


§ 6. A Simple Special Case. — We consider here the degenerate case where, in the transformation (3.1), defining $T$, the $U_i$ and the $\Theta_i$ are identically zero. In this case the $m$th iterate of $T$ becomes simply:

$$u_{im} = u_{i0} = u_i, \qquad \theta_{im} = \theta_{i0} + m\psi_i + m\sum_{j=1}^{n} c_{ij}\, u_j. \tag{6.1}$$


We shall use a theorem proved by Borel (³) in the consideration of the approximation of irrational numbers by continued fractions. An immediate corollary of Borel's theorem is the following

Lemma. — Corresponding to any positive number $\gamma$ not less than 11 and to any real number $\rho$ (rational or irrational), two integers $p$ and $q$ can always be found such that $\left|\rho - \dfrac{p}{q}\right| < \dfrac{1}{\sqrt{5}\, q^2}$ and $\gamma \le q \le 15\gamma^2$.
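
A direct search shows what such a pair of integers looks like in practice. The sketch below assumes that the lemma asserts $|\rho - p/q| < 1/(\sqrt{5}\,q^2)$ together with $\gamma \le q \le 15\gamma^2$; it does not use continued fractions at all, but simply tries every admissible denominator in turn. The function name `approximate` is hypothetical.

```python
import math

def approximate(rho, gamma):
    """Search gamma <= q <= 15*gamma^2 for p/q with |rho - p/q| < 1/(sqrt(5)*q^2)."""
    q_min = math.ceil(gamma)
    q_max = math.floor(15 * gamma * gamma)
    for q in range(q_min, q_max + 1):
        p = round(rho * q)  # nearest integer numerator for this denominator
        if abs(rho - p / q) < 1.0 / (math.sqrt(5.0) * q * q):
            return p, q
    return None  # no pair found in the admissible range

print(approximate(math.sqrt(2.0), 11))  # prints (17, 12)
```

For $\rho = \sqrt{2}$ and $\gamma = 11$ the denominator 11 fails and the search stops at the continued fraction convergent $17/12$.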

We consider a positive number, ‘ rj, which we regard as fixed but as 

- - _ _ 

having been chosen sufficiently small in advance. Let »!, — »«,=- M » — y- * 

Let v t = £ Ci/U,. Then 
/=» 

( 6 . 2 ) 


it » = S c ,A c ( 
>=» 


Lot v be a fixed positive nurabor ^ 1. Then, if >) is sufficiently small, we 
have vp w ^ll. Using the lemma, we choose integers m< and kf such that 

k l _ I < -L ‘ and T)— £ ,.i, S lor,-*-. 

Mi 2 r | \ 5 m « 

Write m = m, • m, • m, ... m„ so that rj-"* < m ^ 15"Tj- /nv . And let 
Af| = »n, • m, ... »!»*_, • hi • >w 44 ., • ... • Then the above inequalities yield 

1 2 *i!? ^ __ Vl I < ^ _L ^ i) 5 **. In other words — <|»i = v* ■+“ e ‘ * " here 

Now. using the above definitions for the integers If,, If,.— and ,M » 
define u t , u,,... u„ by means of the linear equations. 2k t n = »n4»* -1- w ^fo u J • 
and hence 5 = t>< + i*. Solving these equations for the »i 4 . we get with 

the help of (6.2) 

t« A = £ c'*(t> 4 -f- e 4 ) = ii* -4- e*, where I e * I = 1 c '* e< j = • 

i-i l ,= * 


(³) Émile Borel, Leçons sur la théorie de la croissance, p. 149.




The distance of the point (i«) from the point (u) is 


|/s(u» - «,)' = l/ si,' S A„r,‘ 


On the other hand the distance of the point |t<) from any of the manifolds. 

|1 ii. — 2a ii . I — 2a rj 

“• - 2au t = 0 [•«?,, i- found .o be ^ + ~ ^ + 4l V, ’ 2' H "" C "' 

if rj is sufficiently small, the point (u| will lie well within the region 
Ii(r], 2a), its distance from the boundary exceeding A„-rj. 

In order to conform with the notation of the rest of the paper we 

replace v by (p^8iH-4) and u, by if,/. We collect the results of 

this paragraph in the following ^ 

Theorem IV. — Let the positive numbers o^< and p(> 8n h-4) be 

chosen in advance and then held fast. Then it is jtossible to choose the positive 
number rj so small that one can always find integers, k,. k,,... k„, and m, 
dependent upon rj and having the following tiro properties: 

[JC = 15”|. 

tt 

2. The solution of the linear equations 2M = in<J*i in S c y u Jo ', yields 
a point (u/| lying within the region R(t). 2a|, its distance from the boundary 
exceeding 

Here A ti is a suitably chosen positive number, dependent upon n, c IJt |i, 
and a, but independent of r,. 
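
For two degrees of freedom the linear equations of this theorem can be solved explicitly and the result checked against the iterate (6.1). The sketch below is an illustration under assumptions: the equations are read as $2k_i\pi = m\psi_i + m\sum_j c_{ij} u_j$, and the constants `psi`, the matrix `c` and the integers `k`, `m` are hypothetical values, not produced by the lemma on rational approximation.

```python
import math

def solve_2x2(c, rhs):
    """Cramer's rule for a 2 x 2 system c * u = rhs."""
    (a, b), (d, e) = c
    det = a * e - b * d
    return ((rhs[0] * e - b * rhs[1]) / det,
            (a * rhs[1] - d * rhs[0]) / det)

def periodic_point(psi, c, k, m):
    """Solve 2*pi*k_i = m*psi_i + m*sum_j c_ij*u_j for (u_1, u_2)."""
    rhs = [2.0 * math.pi * k_i / m - psi_i for k_i, psi_i in zip(k, psi)]
    return solve_2x2(c, rhs)

def iterate_degenerate(theta, u, psi, c, m):
    """m-th iterate in the degenerate case: theta_i advances by m*(psi_i + sum_j c_ij u_j)."""
    return [t + m * (p + sum(cij * uj for cij, uj in zip(row, u)))
            for t, p, row in zip(theta, psi, c)]

psi = (1.0, 2.0)                # hypothetical rotation constants
c = ((2.0, 1.0), (1.0, 3.0))    # hypothetical matrix with nonzero determinant
k, m = (2, 4), 12
theta_0 = (0.1, 0.2)
u = periodic_point(psi, c, k, m)
theta_m = iterate_degenerate(theta_0, u, psi, c, m)
winding = [(tm - t0) / (2 * math.pi) for tm, t0 in zip(theta_m, theta_0)]
print(u, winding)
```

The printed winding numbers equal the chosen integers $k_i$ to rounding error, so every point with these values of $u$ is invariant under the $m$th iterate.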

§ 7. The General Case. — We need the following elementary lemma:

Lemma. — Let $f_i(u_1, u_2, \ldots, u_n, t)$ $(i = 1, 2, \ldots, n)$ be defined for $0 \le t \le 1$ and for $(u)$ in some closed $n$ dimensional region $S$. Let all the $f_i(u_1, \ldots, u_n, 0)$ vanish together for one and only one set of values for the $u$'s; viz. $u_i = u_i'$. Let $K$ denote the shortest distance from $(u')$ to the boundary of $S$. Suppose that $f_i$ is of class $C''$ and that the Jacobian,

$$J = \frac{\partial(f_1, f_2, \ldots, f_n)}{\partial(u_1, u_2, \ldots, u_n)},$$

is nowhere zero for $(u)$ in $S$ and $0 \le t \le 1$. Let $g_{ij}(u_1, u_2, \ldots, u_n, t)$ represent the cofactor of $\partial f_i/\partial u_j$ in $J$ divided by $J$ itself, so that $\sum_j g_{ij}\,\partial f_k/\partial u_j = \delta_{ik}$. Finally denote by $M$ an upper bound for the functions,

$$F_i(u, t) = -\sum_{j=1}^{n} g_{ji}\,\frac{\partial f_j}{\partial t}.$$

That is, for $(u)$ in $S$ and $0 \le t \le 1$, the $F_i$ do not exceed $M$ in absolute value. Then there exists a unique set of functions $u_1(t), \ldots, u_n(t)$, of class $C'$, defined on some interval, $0 \le t \le t_1$, such that $f_i(u_1(t), \ldots, u_n(t), t) = 0$. Furthermore $u_i(0) = u_i'$ and $t_1$ is the lesser of the two numbers 1 and $K/M$.
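
The lemma amounts to following a root of the equations as the parameter $t$ grows, and this can be tried on a single equation. The sketch below is an illustration under assumptions: the scalar function `f`, the starting root `U_PRIME` and the use of the classical Runge-Kutta scheme are all choices made for the example. The root is carried from $t = 0$ to $t = 1$ by integrating $du/dt = -(\partial f/\partial t)/(\partial f/\partial u)$.

```python
import math

U_PRIME = 0.5  # hypothetical root of f(u, 0) = 0

def f(u, t):
    """Hypothetical one-parameter family of equations f(u, t) = 0."""
    return u - U_PRIME + t * 0.1 * math.sin(u)

def rhs(u, t):
    """du/dt = -(df/dt) / (df/du), obtained by differentiating f(u(t), t) = 0."""
    df_dt = 0.1 * math.sin(u)
    df_du = 1.0 + t * 0.1 * math.cos(u)
    return -df_dt / df_du

def continue_root(steps=100):
    """Classical Runge-Kutta integration of the root from t = 0 to t = 1."""
    u, t, h = U_PRIME, 0.0, 1.0 / steps
    for _ in range(steps):
        k1 = rhs(u, t)
        k2 = rhs(u + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(u + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(u + h * k3, t + h)
        u += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6.0
        t += h
    return u

u1 = continue_root()
print(u1, f(u1, 1.0))  # the residual f(u1, 1) is close to zero
```

Here $\partial f/\partial u$ stays near 1, so the right member is bounded and the root can be followed over the whole interval, which is the situation the lemma quantifies through $K$ and $M$.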

This lemma is for our purposes more advantageous than the usual « implicit function theorem » because it gives us a definite appraisal for the interval on which the functions $u_i(t)$ are defined. We give a brief indication of the proof: Necessary and sufficient conditions on a set of functions, $u_i(t)$, that all the $f_i(u_1(t), \ldots, u_n(t), t) \equiv 0$, are that

$$\frac{df_i}{dt} = \sum_{j=1}^{n} \frac{\partial f_i}{\partial u_j}\,\frac{du_j}{dt} + \frac{\partial f_i}{\partial t} = 0 \quad \text{and} \quad f_i(u_1(0), \ldots, u_n(0), 0) = 0.$$

Solving for the derivatives of the $u$'s we get a system of differential equations, $\dfrac{du_i}{dt} = F_i(u_1, u_2, \ldots, u_n, t)$, in the standard form. These equations are to be solved under the initial conditions $u_i(0) = u_i'$. The lemma now follows from the known existence theorems for systems of ordinary differential equations.

We consider values for $\eta$, not only sufficiently small for the validity of theorem IV, but also so small that all the results of the preceding paragraphs hold if $\zeta \le \eta$. We forthwith choose a set of values for $k_1, k_2, \ldots, k_n$, and $m$, depending on $\eta$ and satisfying the conditions of theorem IV,


(7.1) 




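Now it is easily proved by induction that $T^m$ may be written in the form

$$u_{im} = u_{i0} + \sum_{\nu=0}^{m-1} U_i(u_\nu, \theta_\nu), \qquad \theta_{im} = \theta_{i0} + m\psi_i + m\sum_{j=1}^{n} c_{ij}\, u_{j0} + \Theta_{im}(u_0, \theta_0), \tag{7.2}$$

where

$$\Theta_{im}(u_0, \theta_0) = \sum_{\nu=0}^{m-1} \Big[ (m - 1 - \nu) \sum_{j=1}^{n} c_{ij}\, U_j(u_\nu, \theta_\nu) + \Theta_i(u_\nu, \theta_\nu) \Big]. \tag{7.3}$$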

We wish to show that the equations, $\theta_{im} - \theta_{i0} = 2k_i\pi$, can be solved for the $u_{10}, u_{20}, \ldots, u_{n0}$ in terms of the $\theta_{10}, \ldots, \theta_{n0}$. For this purpose we regard the $\theta$'s as fixed and try to solve the equations,

$$\omega_i \equiv \psi_i - \frac{2k_i\pi}{m} + \sum_{j=1}^{n} c_{ij}\, u_{j0} + \frac{1}{m}\,\Theta_{im}(u_0, \theta_0) = 0, \tag{7.4}$$

for the $u_0$'s. Since $m\omega_i$ differs from $\theta_{im}$ by a constant, $2k_i\pi + \theta_{i0}$, it appears

that = i-?——, which, according to theorem III, differs from c 4 , by an 

du JO m ?Hj„ 

infinitesimal in uniformly with respect to m, in being required not to 

exceed KC.” 4+ t> 4 *')• Th ' 8 mean)1 ,hal tendR lo zt * ro 

with y), uniformly with respect to in, and, hence, in particular, if in is de¬ 
termined as in theorem IV. We assume always that (»«J lies in 2a). 

It will be convenient at this stage to introduce a parameter t and con¬ 
sider the equations. 


fii't .. 0 — ~ 


H- S CyUj, I- , 0.) = 0. 

i'l 


We allow $t$ to assume values on the interval $0 \le t \le 1$, and we shall try to
solve (7.5) for the $u_{i0}$'s as functions of $t$. Evidently, if we can do this, all
we need to do is to set $t = 1$ to get the required solution of (7.4).

We try to apply the lemma. In the first place 


it_ 




which differs from c 4 > by an infinitesimal in t). Hence, if we denote by # 4/ the 

cofactor of in * fty /»* divided by the Jacobian itself, we 

du, 9 3(1#,., «m.I 

see that g ti differs from c tJ by an infinitesimal in tj. Hence \tJij\<A u . 
Also it follows from (7.3), (2.2). (7.1), and from the elementary fact that 

H1-W I 

2 (m — 1 — v) = - i;i|m — 1) that 

»'d * 


ISI-U— 


Hence, we find that 


| - = W. jl | S nA„A„n' r **= A.rf*** 


which is the 3/ of the lemma. 

In the second place, we know from theorem IV that for f = 0 equa¬ 
tions (7.5) have one and only one solution, m' i0 , »i' f0 ,... h'„ 0 , which lies well 





■ 32 G. D. Birkhoff and D. C. Lewis : On the Periodic Motion, Near a given 


within the region 2.). For the region S of the lemma, we shall take a 

certain 1 dimensional sphere together with its interior. The cental ’ ot 
i8 to be the point (tt‘) and the radius is to be A „>)'. By theorem 
whole of S lies within J*,. 2a). The . K . of the lemma is therefore equal 

l ° ^Hence n unique Bet of solving functions «„(/). exists the 

interval of definition being 0£l£f.. "here t, is the lesser of the two 

numbers A »" d '• Hence. if *1 is taken sufficiently small, f, = 1, and 

„ ,1). 1). ••• .<...(1) satisfy 17.41. as required. The solution (...) thus obtained, 

also has the property that its distance from («. | does not exceed A„r, . 
Hence ?, exceeds an infinitesimal of the first order in r). 

For a fixed sufficiently small value of and with a corresponding fixed 
choice for the k, and in. the solution (n.l is unique, at least so far as tile 
reeion /fin, 2a) is concerned. For suppose there were a second solution (...) 
corresponding to each element of an infinite sequence of tending to zero. 
We can join the two points (...) and (n.l will, a straight line segment, 
which lies wholly within //|r,. 2.) and whose direction cosines we denote 
by X,. X t .... X„. Since 0, m Iiiih Hit* same value (vis. #, < 0 2A«it| at both end* 

of the segment, its directional derivative. S ?*[” kj, must vanish at some 

intermediate point P,. We know, by theorem III. that 

- V I X, = S e„X, -t-t,(/\ T)l 
\r i=* 

where x t (P . r,| tends to zero with r), uniformly as lo P or the X*s. Hence 


S Sc 4 / X, x t [P i% -0) 

4.1 >-• 


=°;.“4 


an infinitesimal in t), where, of course, the X 4 depend upon yj. Let A,. 
A,,... A„ be the set of values for the X, which, under the condition SX,= = 1 

makes )’« minimum. Then we infer that j “ 1 c b A .) = "" 

infinitesimal' in r,. But. since the A, are independent of r,. this implies that 

£ Cl ,Aj = 0 for A. not all zero: and thus we obtain a contradiction of the 

hypothesis that the determinant of the c„ is different from zero. 

The obtained unique values for «... «.... which satisfy (<.4| may 





be regarded as single valued functions, 8 t0 »*« of t,ie V s - They 

are obviously periodic of period 2a, since the <I>4 are also. The implicit 
function theorem, furthermore, shows that they are analytic in the neighborhood 
of any point (0, o , 0„ o ). We have thus proved the following theorem, 

which is the main result of this paper: 

Theorem V. — It is possible to find a manifold in the space of the 2n 

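variables, $u_{10}, \ldots, u_{n0}, \theta_{10}, \ldots, \theta_{n0}$ (or the original variables
$x_{i0} = u_{i0}^{1/2}\cos\theta_{i0}$, $y_{i0} = u_{i0}^{1/2}\sin\theta_{i0}$) defined by equations of the type,

$$u_{i0} = B_i(\theta_{10}, \theta_{20}, \ldots, \theta_{n0}), \qquad i = 1, 2, \ldots, n,$$

along which, for a suitable choice of the integer $m$, the $\theta_{im}$ differ from the $\theta_{i0}$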
by integral multiples of 2a. The B, are analytic single-valued non-vanishing 
periodic functions of period 2a in the »,*s. It is assumed that ji (which ap¬ 
pears in the equations defining T| is not less than than 8n 4. 

Furthermore this manifold may be taken in such a way that, given a 

positive number a < *. the u u satisfy the following relations: 

1. “!? ^ 2a for all pairs of indices i, j = 1. 2,... n. 

2. S u Jn *[■§ s.’l arbitrarily small. 

3. O^m ^ 4> \ where K = 15°. 

Item 2, of course, implies that there are an infinite number of manifolds 
of the type described in the theorem. 





Reprinted from Bull. Amer. Math. Soc., October 1933, Vol. 39, 
pp. 681-700. 


QUANTUM MECHANICS AND ASYMPTOTIC SERIES†

BY G. D. BIRKHOFF 


Part I. 

1. Introduction. In its bold primary outline the program of
quantum mechanics in the Schrödinger form runs as follows:

(A) Set up the Hamiltonian equations of the atomic system
(nucleus + electrons) on a classical basis:

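$$\frac{dx_i}{dt} = \frac{\partial H}{\partial y_i}, \qquad \frac{dy_i}{dt} = -\frac{\partial H}{\partial x_i}, \qquad (i = 1, \cdots, n),$$

where $x_i$, $y_i$ are ordinary rectangular coordinates and momenta,
and $H(x_1, \cdots, x_n;\ y_1, \cdots, y_n)$ the total energy. The
associated Hamilton-Jacobi partial differential equation is then

$$\frac{\partial S}{\partial t} + H\left(x_1, \cdots, x_n;\ \frac{\partial S}{\partial x_1}, \cdots, \frac{\partial S}{\partial x_n}\right) = 0.$$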


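(B) Write down the corresponding homogeneous linear partial
differential equation (the Schrödinger wave equation):

$$\frac{1}{\lambda}\frac{\partial\psi}{\partial t} + H\left(x_1, \cdots, x_n;\ \frac{1}{\lambda}\frac{\partial}{\partial x_1}, \cdots, \frac{1}{\lambda}\frac{\partial}{\partial x_n}\right)\psi = 0,$$

where the operational symbols in $H$ appear on the right hand
side of the individual terms of $H$, and where $\lambda = 2\pi i/h$, if $h$
is Planck’s constant.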

(C) Write

$$\psi = e^{-\lambda E t}\,\psi^*(x_1, \cdots, x_n),$$

thus obtaining the linear differential equation (written in
operational form)

$$(H - E)\psi^* = 0,$$

and determine the characteristic solutions, $\psi_1^*, \psi_2^*, \cdots$, for
which the $\psi_i^*$ vanish at infinity in such wise that

† An address delivered at Chicago, June 20, 1933, before the Society and
Section A of the American Association for the Advancement of Science. In 
connection with the first part of this lecture, see two notes in the Proceedings 
of the National Academy of Sciences for March and April, 1933. 





is finite. The corresponding $E_1, E_2, \cdots$ then prescribe the
possible “energy levels,” so that the possible “spectral
frequencies,” $\nu_{mn}$, are those given by the formula

$$h\nu_{mn} = E_m - E_n, \qquad (E_m > E_n),$$


in accordance with the Planck-Einstein law. 


Thus the program begins by relating the physical problem 
to a special linear boundary value problem of classical type. 
In its further development the aim is to obtain a complete 
account of atomic properties which is in accord with this 
starting point. While there has been extraordinary progress, 
there can be little doubt that complete success of the program 
is hardly to be hoped for. 

My purpose today is to lay before you a tentative answer to 
two important mathematical questions raised by the primary 
program itself. 

Firstly, what is the mathematical significance of the Schrödinger
wave equation in its relation to the Hamiltonian equations?
My answer will be given in terms of classic formal processes
connected with asymptotic series. It is true that the
theoretical physicist has obtained a kind of “deduction” of the
wave equation on the basis of the analogy between the wave
theory of light and the elementary optical theory (wave and
particle theory). But I hope to bring out more clearly the true
inwardness of the Schrödinger wave equation as a purely
mathematical entity.†

Secondly, the form of the wave equation which arises in
practice is usually reducible by means of separation of variables
to an ordinary differential equation essentially of the following
type:

$$\frac{d^2\psi}{dx^2} + \frac{8\pi^2 m}{h^2}\,(E - V(x))\,\psi = 0,$$


where the function V(x) is defined in a certain interval in which 
E— V(x) changes sign. For the determination of the character- 

† For references to the important earlier work of de Broglie, L. Brillouin,
Schrödinger, and Dirac, see my notes cited above.






istic numbers and functions, certain asymptotic series have been 
employed by Wentzel, Brillouin and Kramers.† It is, however,
an open question as to whether or not their methods are
justifiable. However, in an important recent paper bearing
directly upon the points at issue, Langer‡ announces that he
intends later to give a general discussion of this and similar 
questions. My intention here is to outline a simple justification 
of the Wentzel-Brillouin-Kramers method on the basis of a 
slight extension of some earlier results of my own. 

Thus asymptotic series play a central role in what I have to 
say today. I would not be surprised if such series were found 
ultimately to be of importance in other aspects of quantum 
mechanics; for example, in connection with the proper formulation
of Heisenberg’s uncertainty principle.

2. Linear Equations and Asymptotic Series. Let

$$L(\psi, \lambda) = 0 \tag{1}$$

be any linear homogeneous differential equation in the dependent
variable $\psi$ and the independent variables $x_1, x_2, \cdots, x_n$,
and involving $\lambda$, where $\lambda$ is a large parameter. This equation will
be ordinary or partial according as $n = 1$ or $n > 1$. Let us suppose
that the coefficients of $\psi$ and of its derivatives in $L(\psi, \lambda)$ are
analytic in $x_1, \cdots, x_n$, and $\lambda$, and expansible in convergent
power series in $1/\lambda$ for $|\lambda| > A$.

Now under these circumstances it has frequently been found
that the differentiation of certain solutions $\psi$, as well as of
their various derivatives, with respect to $x_i$ is asymptotically
equivalent to multiplication by $\lambda\,\partial S/\partial x_i$, where $S$ is a suitable
function of $x_1, \cdots, x_n$. For instance in the simple case $n = 1$
of an equation

$$L(\psi, \lambda) = \frac{1}{\lambda^2}\frac{d^2\psi}{dx^2} + \psi = 0 \tag{1a}$$

with solutions $e^{\pm\lambda i x}$, we have $d\psi/dx = \pm\lambda i\psi$, so that $S = \pm ix$ in

† For the principal references see a note by J. L. Dunham, On Wentzel-
Brillouin-Kramers' method of solving the wave equation, Physical Review, vol.
41 (Sept. 15, 1932).

‡ R. E. Langer, On the asymptotic solutions of differential equations, with
an application to the Bessel functions of large complex order, Transactions of
this Society, vol. 34 (1932), pp. 447-480.




this case. Again, in the case of the Fourier’s equation with
independent variable taken as $\lambda x$,

$$L(\psi, \lambda) = \frac{1}{\lambda^2}\frac{d^2\psi}{dx^2} + \frac{1}{\lambda^2}\,\frac{1}{x}\frac{d\psi}{dx} + \psi = 0, \tag{1b}$$

we have $S = \pm ix$ as before for suitable solutions.

In consequence of such an asymptotic relationship we are
led to write

$$\psi \sim e^{\lambda S} v_0$$

as a first approximation to $\psi$, and thence successively to an
asymptotic series for $\psi$,

$$\psi \sim e^{\lambda S}\left(v_0 + \frac{v_1}{\lambda} + \frac{v_2}{\lambda^2} + \cdots\right),$$

where $v_0, v_1, \cdots$ are functions of $x_1, \cdots, x_n$. The precise test
for such an asymptotic series solution is of course that when it is
substituted for $\psi$ in $L(\psi, \lambda)$, with the indicated differentiations
carried out and the coefficients of like powers of $\lambda$ collected
according to the usual formal rules, the expression $L(\psi, \lambda)$ reduces
identically to 0. It is not to be expected in general that such a
series converges and yields an actual solution, although this
may occur in special cases. Obviously it may be assumed that
$v_0$ does not vanish identically, since otherwise we could remove
a factor $1/\lambda$ from the solution.
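
The test just described can be tried numerically on equation (1b). The following sketch is illustrative only and is not part of the original address. It assumes the phase function $S = ix$ and the trial amplitude $v_0 = x^{-1/2}$; the function names `psi` and `residual` are hypothetical. It evaluates $L(\psi, \lambda)$ for the one-term expression $e^{\lambda S} v_0$ with exact derivatives, and multiplies the result by $\lambda^2$ to see whether the terms of order $1$ and $1/\lambda$ have cancelled.

```python
import cmath


def psi(x, lam):
    """One-term trial solution e^{lam*S} * v0, assuming S = i*x and v0 = x**-0.5."""
    return cmath.exp(1j * lam * x) * x ** -0.5


def residual(x, lam):
    """Value of L(psi, lam) for equation (1b), using exact derivatives of psi."""
    p = psi(x, lam)
    a = 1j * lam - 0.5 / x                # logarithmic derivative psi'/psi
    d1 = a * p                            # psi'
    d2 = (a * a + 0.5 / x ** 2) * p       # psi'' = (a^2 + a') * psi, with a' = 0.5/x^2
    return d2 / lam ** 2 + d1 / (lam ** 2 * x) + p


if __name__ == "__main__":
    x = 2.0
    for lam in (10.0, 100.0, 1000.0):
        # If the first two orders cancel, this product stays bounded as lam grows.
        print(lam, abs(residual(x, lam)) * lam ** 2)
```

For this particular trial function the scaled residual works out by hand to $1/(4x^{5/2})$, independent of $\lambda$, which is what a first approximation in the above sense should give.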

We shall term $S$ a “phase function,” and any corresponding
$v_0$ an “amplitude function” of $S$. The reasons for this designation
will appear subsequently.

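We propose first to outline some fundamental facts concerning
this classical formal process which have apparently escaped
attention. For this purpose we find it convenient to introduce
the modified differential operators

$$\frac{\partial^{[1]} F}{\partial x_i} = \frac{1}{\lambda}\frac{\partial F}{\partial x_i}, \qquad \frac{\partial^{[2]} F}{\partial x_i\,\partial x_j} = \frac{1}{\lambda^2}\frac{\partial^2 F}{\partial x_i\,\partial x_j}, \qquad \text{etc.†}$$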


In the two special cases noted above this notation allows us
to write the equations (1a), (1b) as follows:

† See my paper, Transactions of this Society, vol. 9 (1908), pp. 219-231.

$$\frac{d^{[2]}\psi}{dx^2} + \psi = 0; \qquad \frac{d^{[2]}\psi}{dx^2} + \frac{1}{\lambda}\,\frac{1}{x}\frac{d^{[1]}\psi}{dx} + \psi = 0,$$


while, more generally, $L(\psi, \lambda)$ may be written as a power series
in $1/\lambda$,

$$L(\psi, \lambda) = L_0(\psi) + \frac{1}{\lambda} L_1(\psi) + \cdots, \tag{2}$$

where $L_l(\psi)$, $(l = 0, 1, \cdots)$, are linear homogeneous expressions
in $\psi$, $\partial^{[1]}\psi/\partial x_i$, $\partial^{[2]}\psi/\partial x_i\partial x_j$, etc., and where we may assume
that $L_0(\psi) \not\equiv 0$, since we may always multiply through by a
suitable power of $\lambda$. Evidently, then, we may write


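$$L_l(\psi) = \ell^{(l)}\psi + \sum_i \ell_i^{(l)}\frac{\partial^{[1]}\psi}{\partial x_i} + \sum_{i,j}\ell_{ij}^{(l)}\frac{\partial^{[2]}\psi}{\partial x_i\,\partial x_j} + \cdots. \tag{3}$$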

Furthermore, on account of the interchangeability of the order
of differentiation, we may assume

$$\ell_{ij}^{(l)} = \ell_{ji}^{(l)}, \qquad \ell_{ijk}^{(l)} = \ell_{ikj}^{(l)} = \ell_{jik}^{(l)} = \cdots. \tag{4}$$

The order of the expression $L(\psi, \lambda)$ is evidently that of the
highest order of $L_0, L_1, \cdots$, say $m$. We shall assume that $m$
is the actual order of $L_0(\psi)$, and shall term $L_0(\psi)$ the “principal
part” of $L(\psi, \lambda)$; in dealing with $L_0$ we shall omit the superscripts
in referring to $\ell^{(0)}, \ell_i^{(0)}, \cdots$.

Later on we shall have something to say concerning the
existence of actual solutions of (1) corresponding to such
asymptotic series solutions in the special case $m = 2$. For the
present we shall make only the following heuristic remarks.
(1) Solutions asymptotically represented by such series solutions
will exist for suitably restricted ranges of the variables
$x_1, \cdots, x_n$. (2) These actual solutions are not uniquely determined
by the series solutions; for example, if $\psi$ is so represented,
so also will $(1 + e^{-\lambda})\psi$ be if $\lambda$ is real and positive. (3) The solution
$c_1\psi_1 + c_2\psi_2$, where $\psi_1$ and $\psi_2$ are represented by two such
series, is in general represented asymptotically by the dominant
one of the series for $c_1\psi_1$ or $c_2\psi_2$.




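If we substitute in the hypothetical series for $\psi$ and equate
the term independent of $\lambda$ to zero, we obtain an equation

$$P\left(x_1, \cdots, x_n;\ \frac{\partial S}{\partial x_1}, \cdots, \frac{\partial S}{\partial x_n}\right) = 0 \tag{5}$$

on removal of a factor $e^{\lambda S} v_0$. Here the explicit expression for $P$ is

$$P = \ell + \sum_i \ell_i\frac{\partial S}{\partial x_i} + \sum_{i,j}\ell_{ij}\frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x_j} + \cdots. \tag{6}$$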


Thus $P$ is a polynomial of degree $m$ in $\partial S/\partial x_i$, $(i = 1, \cdots, n)$, in
which one or more terms of degree $m$ are actually present.

We shall term (5) the “multiplier equation” for $L(\psi, \lambda) = 0$.
This is an equation of the first order in $S$ which does not contain
$S$, and which is in general non-linear. In the two examples above
the multiplier equation is

$$1 + \left(\frac{dS}{dx}\right)^2 = 0,$$

so that we find $S = \pm ix$, up to an additive constant.

It is also to be noted that for a given polynomial $P$ there is
one and only one corresponding principal part $L_0$, while $L_1,
L_2, \cdots$ remain entirely arbitrary. For convenience we shall
term the special case in which $L = L_0$ the “principal equation”
for the given multiplier equation.

Let us proceed to the determination of $v_0, v_1, \cdots$, which
turn out to be respectively determined by the later equations
for $k = 1, 2, \cdots$. We have then to substitute the expressions
for $\partial^{[1]}\psi/\partial x_i, \cdots$ in the power series for $L(\psi, \lambda)$, and equate
the coefficients of $1/\lambda, 1/\lambda^2, \cdots$ to zero. In the case $k = 1$, we
observe first that a single term only, $Q e^{\lambda S} v_0/\lambda$, is contributed by
the terms after the first in the series for $L(\psi, \lambda)$; here

$$Q = \ell^{(1)} + \sum_i \ell_i^{(1)}\frac{\partial S}{\partial x_i} + \cdots.$$

Thus $Q$ is related to $L_1$ just as $P$ is to $L_0$. It remains then to
determine the term in $1/\lambda$ which arises from $L_0$. To do so we
note that, to terms of the first order in $1/\lambda$,



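$$\frac{\partial^{[1]}\psi}{\partial x_i} = e^{\lambda S}\left(\frac{\partial S}{\partial x_i} + \frac{1}{\lambda}\frac{\partial}{\partial x_i}\right)\left(v_0 + \frac{v_1}{\lambda}\right), \tag{7}$$

$$\frac{\partial^{[2]}\psi}{\partial x_i\,\partial x_j} = e^{\lambda S}\left(\frac{\partial S}{\partial x_i} + \frac{1}{\lambda}\frac{\partial}{\partial x_i}\right)\left(\frac{\partial S}{\partial x_j} + \frac{1}{\lambda}\frac{\partial}{\partial x_j}\right)\left(v_0 + \frac{v_1}{\lambda}\right),$$

and so on. Here we use an obvious operational notation.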

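Now we observe that on substitution in $L_0$, the terms in $v_1$
disappear identically because of the multiplier equation. Hence
aside from the factor $e^{\lambda S}$, the coefficient of $1/\lambda$ in $L_0$ is

$$\begin{aligned}
&\sum_i \ell_i\frac{\partial v_0}{\partial x_i} + \sum_{i,j}\ell_{ij}\left(\frac{\partial S}{\partial x_i}\frac{\partial v_0}{\partial x_j} + \frac{\partial S}{\partial x_j}\frac{\partial v_0}{\partial x_i} + \frac{\partial^2 S}{\partial x_i\,\partial x_j}\,v_0\right)\\
&\quad + \sum_{i,j,k}\ell_{ijk}\Bigg(\frac{\partial S}{\partial x_j}\frac{\partial S}{\partial x_k}\frac{\partial v_0}{\partial x_i} + \frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x_k}\frac{\partial v_0}{\partial x_j} + \frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x_j}\frac{\partial v_0}{\partial x_k}\\
&\qquad\qquad + \left(\frac{\partial S}{\partial x_i}\frac{\partial^2 S}{\partial x_j\,\partial x_k} + \frac{\partial S}{\partial x_j}\frac{\partial^2 S}{\partial x_i\,\partial x_k} + \frac{\partial S}{\partial x_k}\frac{\partial^2 S}{\partial x_i\,\partial x_j}\right)v_0\Bigg) + \cdots,
\end{aligned}$$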


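where the general law of formation is obvious. But on account
of the symmetry relations (4), this may be written

$$\begin{aligned}
&\sum_i \ell_i\frac{\partial v_0}{\partial x_i} + 2\sum_{i,j}\ell_{ij}\frac{\partial S}{\partial x_i}\frac{\partial v_0}{\partial x_j} + 3\sum_{i,j,k}\ell_{ijk}\frac{\partial S}{\partial x_i}\frac{\partial S}{\partial x_j}\frac{\partial v_0}{\partial x_k} + \cdots\\
&\quad + \frac{1}{2}\left(2\sum_{i,j}\ell_{ij}\frac{\partial^2 S}{\partial x_i\,\partial x_j} + 3\cdot 2\sum_{i,j,k}\ell_{ijk}\frac{\partial S}{\partial x_i}\frac{\partial^2 S}{\partial x_j\,\partial x_k} + \cdots\right)v_0.
\end{aligned}$$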


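But the coefficient of $\partial v_0/\partial x_i$ in the first line is $\partial P/\partial y_i$ if we
write $y_i = \partial S/\partial x_i$ in $P$; and the coefficient of $\partial^2 S/\partial x_i\partial x_j$ in the
second line is $(v_0/2)\,\partial^2 P/\partial y_i\partial y_j$. Hence the required condition for
$k = 1$ may be written in the form

$$\sum_i \frac{\partial P}{\partial y_i}\frac{\partial v_0}{\partial x_i} + \left(\frac{1}{2}\sum_{i,j}\frac{\partial^2 P}{\partial y_i\,\partial y_j}\frac{\partial^2 S}{\partial x_i\,\partial x_j} + Q\right)v_0 = 0. \tag{8}$$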

Let us next throw this linear differential equation in $v_0$ into
a different form, by use of the curves $x_i = x_i(\tau)$, $(i = 1, \cdots, n)$,
defined by the $n$ ordinary differential equations of the first order

$$\frac{dx_i}{d\tau} = \frac{\partial P}{\partial y_i}, \qquad (i = 1, \cdots, n). \tag{9}$$

Along any such curve in $n$-dimensional $(x_1 \cdots x_n)$-space, (8)
may be written

$$\frac{dv_0}{d\tau} + \Phi v_0 = 0, \tag{10}$$

where $\Phi$ is the coefficient of $v_0$ in (8). The solution of (8) is
therefore

$$v_0 = v_0^*\, e^{-\int \Phi\, d\tau}. \tag{11}$$

Thus the equation for $k = 1$ may be looked upon as determining
the value of $v_0$ throughout a tube of these integral
curves, once $v_0$ has been assigned values $v_0^*$ on a particular
transversal surface $\Sigma$. The later coefficients do not enter at this
first stage $k = 1$.
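
As an illustration, not in the original text, the condition for $k = 1$ can be worked out for example (1b). It is assumed here that for $n = 1$ condition (8) reads $\frac{\partial P}{\partial y}\,v_0' + \bigl(\tfrac{1}{2}\frac{\partial^2 P}{\partial y^2}\,S'' + Q\bigr)v_0 = 0$, with $Q$ formed from $L_1$ in the way $P$ is formed from $L_0$, and that the branch $S = ix$ is chosen:

```latex
\begin{align*}
P &= 1 + y^2, & S &= ix, & y &= \frac{dS}{dx} = i,\\
\frac{\partial P}{\partial y} &= 2i, &
\frac{\partial^2 P}{\partial y^2}\,\frac{d^2 S}{dx^2} &= 0, &
Q &= \frac{1}{x}\,\frac{dS}{dx} = \frac{i}{x},\\
2i\,\frac{dv_0}{dx} + \frac{i}{x}\,v_0 &= 0
& &\Longrightarrow & v_0 &= C\,x^{-1/2}.
\end{align*}
```

Under these assumptions the first approximation is $e^{i\lambda x}x^{-1/2}$, which has the familiar decay of Bessel functions of large argument. For (1a), where $Q = 0$, the same computation gives a constant amplitude.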

If now we turn to the later equations for any $k$ $(k > 1)$, it is
clear that these have a similar form

$$\frac{dv_{k-1}}{d\tau} + \Phi v_{k-1} + A_{k-1} = 0, \tag{12}$$

where $A_{k-1}$ is a known linear differential expression in $v_0,
\cdots, v_{k-2}$. Hence we find that $v_0, v_1, \cdots$ are determined in
succession by their values on the transversal surface $\Sigma$.

In particular we may suppose that $v_0, v_1, \cdots$ are given arbitrarily
on a small region $\sigma$ of $\Sigma$, continuous together with all of
their partial derivatives in $\sigma$ but vanishing along the boundary
of and outside of $\sigma$. Evidently these functions will then vanish
similarly all along the tube and outside of it. We shall refer to
a formal solution $\psi$ of this nature as an “asymptotic wave
packet” solution for obvious reasons.

3. The Associated Canonical Equations. Let us assume now
that $S$ is not only a solution of the partial differential equation
$P = 0$, but belongs to an $n$-parameter family of solutions

$$S(x_1, \cdots, x_n;\ c_1, \cdots, c_{n-1}) + c_n,$$

involving $n$ constants $c_1, \cdots, c_n$, one of which, $c_n$, is additive.
We shall assume that the $n - 1$ constants $c_1, \cdots, c_{n-1}$ are “independent”
in the sense that if we write $y_i = \partial S/\partial x_i$ so that always,
by (6), $P(x_1, \cdots, x_n;\ y_1, \cdots, y_n) = 0$, the $y_i$'s may be regarded
as independent except for the relation just written, that is, we
shall assume that the $n \times (n - 1)$ matrix

$$\left\|\partial^2 S/\partial x_i\,\partial c_j\right\|, \qquad (i = 1, \cdots, n;\ j = 1, \cdots, n - 1),$$

is of rank $n - 1$. In general an arbitrary (non-singular) solution
$S$ of (5) can be imbedded in such a “complete solution.”

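But by differentiation of the equation $y_i = \partial S/\partial x_i$ along a
curve yielding a solution of (9) for any particular set of values
of $c_1, \cdots, c_{n-1}$, we obtain

$$\frac{dy_i}{d\tau} = \sum_j \frac{\partial^2 S}{\partial x_i\,\partial x_j}\frac{dx_j}{d\tau} = \sum_j \frac{\partial^2 S}{\partial x_i\,\partial x_j}\frac{\partial P}{\partial y_j}$$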
by use of (9). On the other hand, by partial differentiation of the
identity (6) as to $x_i$ we obtain

$$\frac{\partial P}{\partial x_i} + \sum_j \frac{\partial P}{\partial y_j}\frac{\partial^2 S}{\partial x_j\,\partial x_i} = 0.$$

By combination of this equation and the one which precedes it,
we conclude that the following equations also obtain:

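$$\frac{dy_i}{d\tau} = -\frac{\partial P}{\partial x_i}. \tag{13}$$

It will be observed then that the equations (9), (13) yield a
canonical system in $x_i, y_i$, $(i = 1, \cdots, n)$, of the $2n$th order,

$$\frac{dx_i}{d\tau} = \frac{\partial P}{\partial y_i}, \qquad \frac{dy_i}{d\tau} = -\frac{\partial P}{\partial x_i}, \tag{14}$$

having a principal function $P$ not containing the independent
variable $\tau$.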

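Moreover we have

$$\frac{d}{d\tau}\left(\frac{\partial S}{\partial c_i}\right) = \sum_j \frac{\partial^2 S}{\partial c_i\,\partial x_j}\frac{dx_j}{d\tau} = \sum_j \frac{\partial^2 S}{\partial c_i\,\partial x_j}\frac{\partial P}{\partial y_j} = 0,$$

since $\partial P/\partial c_i = 0$. Hence we infer that the solutions of (14) under
consideration satisfy the equations

$$d_i = \frac{\partial S}{\partial c_i}, \qquad (i = 1, \cdots, n - 1). \tag{15}$$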


But $x_1, \cdots, x_n, y_1, \cdots, y_n$ can evidently be taken as any
point on the $(2n - 1)$-dimensional manifold $P = 0$ at $\tau = 0$ so that
we obtain in this way the general solution $P = 0$ of (14) in the
form

$$y_i = \frac{\partial S}{\partial x_i}, \quad (i = 1, \cdots, n); \qquad d_i = \frac{\partial S}{\partial c_i}, \quad (i = 1, \cdots, n - 1), \tag{16}$$

where $S(x_1, \cdots, x_n, c_1, \cdots, c_{n-1}) + c_n$ is a solution of the multiplier
equation of the specified generality. It will be observed
that there are $2n - 2$ arbitrary constants involved, namely
$c_1, \cdots, c_{n-1}, d_1, \cdots, d_{n-1}$. Thus (16) defines a $(2n - 2)$-parameter
family of curves filling up the $(2n - 1)$-dimensional manifold
$P = 0$; the parameter $\tau$ is then determined by setting

$$d\tau = \frac{dx_1}{\partial P/\partial y_1} = \cdots = -\frac{dy_n}{\partial P/\partial x_n}$$

and integrating.

It need hardly be remarked that the equations (14) define the
Cauchy characteristics of the partial differential equation $P = 0$,
in the theory of the solution of which the equations (16) play a
well known fundamental part. Thus we may state the following
result.

As $\tau$ varies, an asymptotic wave packet solution of $L(\psi, \lambda) = 0$,
belonging to a non-singular phase function $S$ and an amplitude
function $v_0$, travels along the corresponding Cauchy characteristic
in $x_1 \cdots x_n$ space, defined by the canonical equations (14),
where the initial values of $y_i$ are given by the equations

$$y_i = \partial S/\partial x_i, \qquad (i = 1, \cdots, n).$$

4. On Certain Integral Invariants. We propose next to give the
condition (8) an alternative integral invariant form. It is well
known that an integral such as

$$I = \int_{V(\tau)} G v_0^k\, dx_1 \cdots dx_n \tag{17}$$

will be invariant as $\tau$ changes in case a certain divergence vanishes:

$$\sum_i \frac{\partial}{\partial x_i}\left(G v_0^k\,\frac{dx_i}{d\tau}\right) = 0.$$


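Here $V(\tau)$ is a volume in $x_1 \cdots x_n$ space which moves as $\tau$
changes in accordance with equations (14), where $y_i = \partial S/\partial x_i$.
This yields

$$\sum_i \frac{\partial G}{\partial x_i}\frac{\partial P}{\partial y_i}\,v_0^k + kG\left(\sum_i \frac{\partial P}{\partial y_i}\frac{\partial v_0}{\partial x_i}\right)v_0^{k-1} + G\left(\sum_i \frac{\partial^2 P}{\partial x_i\,\partial y_i} + \sum_{i,j}\frac{\partial^2 P}{\partial y_i\,\partial y_j}\frac{\partial^2 S}{\partial x_i\,\partial x_j}\right)v_0^k = 0,$$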


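which by virtue of (8) reduces to

$$\sum_i \frac{\partial P}{\partial y_i}\frac{\partial G}{\partial x_i} + \left(\sum_i \frac{\partial^2 P}{\partial x_i\,\partial y_i} - \left(\frac{k}{2} - 1\right)\sum_{i,j}\frac{\partial^2 P}{\partial y_i\,\partial y_j}\frac{\partial^2 S}{\partial x_i\,\partial x_j} - kQ\right)G = 0. \tag{18}$$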


In this equation $x_1, \cdots, x_n$ are the independent variables since
$\partial S/\partial x_i$ is substituted for $y_i$ throughout. It may be noticed here
that $G = \text{const.}$ is a solution for $k = 2$, $Q = 0$, provided that

$$\sum_i \frac{\partial^2 P}{\partial x_i\,\partial y_i} = 0 \tag{19}$$

regardless of the choice of $y_i$; we shall have occasion to employ
this result later on.

Conversely it is evident that in general if for the given $S$ and
$k$, a function $G$ is a solution of (18), and if $\int_{V(\tau)} G v_0^k\, dx_1 \cdots dx_n$
is an integral invariant for any region $V$, then $v_0$ must satisfy the
equation (8).

We may now announce the following result.

If $v_0$ is any amplitude function of the non-singular phase function
$S$ and if $G$ satisfies the linear partial differential equation (18)
for this $S$, and some $k$, then

$$\int_{V(\tau)} G v_0^k\, dx_1 \cdots dx_n$$

is an invariant integral for any volume $V(\tau)$, where $x_1, \cdots, x_n$
vary with $\tau$ in accordance with (14) (with $y_i = \partial S/\partial x_i$). Conversely,
if this integral is invariant for all regions $V(\tau)$, then $v_0$ is an amplitude
function which satisfies the equation (8).

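5. The Schrödinger Wave Equation. Suppose now that we take
$n + 1$ variables $t, x_1, \cdots, x_n$, with

$$P = \frac{\partial S}{\partial t} + H\left(x_1, \cdots, x_n;\ \frac{\partial S}{\partial x_1}, \cdots, \frac{\partial S}{\partial x_n}\right),$$

so that the “multiplier equation” takes the form of the usual
Hamiltonian equation. The corresponding principal equation
$L_0(\psi) = 0$ is then the usual Schrödinger wave equation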


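$$\frac{h}{2\pi i}\frac{\partial\psi}{\partial t} + H\left(x_1, \cdots, x_n;\ \frac{h}{2\pi i}\frac{\partial}{\partial x_1}, \cdots, \frac{h}{2\pi i}\frac{\partial}{\partial x_n}\right)\psi = 0, \tag{20}$$

provided we take $\lambda = 2\pi i/h$.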

The Schrödinger equation is therefore merely the principal equation
which has the usual Hamilton-Jacobi partial differential
equation as its multiplier equation, with $\lambda = 2\pi i/h$.

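Furthermore, the corresponding Hamiltonian equations are

$$\frac{dt}{d\tau} = 1, \qquad \frac{d}{d\tau}\left(\frac{\partial S}{\partial t}\right) = 0,$$

together with

$$\frac{dx_i}{d\tau} = \frac{\partial H}{\partial y_i}, \qquad \frac{dy_i}{d\tau} = -\frac{\partial H}{\partial x_i}, \tag{21}$$

with $y_i = \partial S/\partial x_i$. Hence we find $\tau = t + \text{const.}$, $\partial S/\partial t = \text{const.}$
along any trajectory. But in this case a complete solution can
be found in the form

$$S = S^*(x_1, \cdots, x_n;\ c_1, \cdots, c_{n-1}) + c_n t + c_{n+1}$$

with $y_i = \partial S^*/\partial x_i$ for $i = 1, \cdots, n$. Hence the equations for the
Cauchy characteristics reduce to the ordinary Hamiltonian
equations (21) with $t = \tau$, associated with Schrödinger’s wave
equation.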

Furthermore, it is easily proved that

$$I = \int_{V(\tau)} \psi\bar\psi\, dt\, dx_1 \cdots dx_n,$$

where $\bar\psi$ denotes the conjugate of $\psi$, is then an integral invariant
to the first order in $1/\lambda$, at least if

$$\sum_i \frac{\partial^2 H}{\partial x_i\,\partial y_i} = 0. \tag{19'}$$

In fact, since $\lambda = 2\pi i/h$ is a pure imaginary, we have

$$\psi \sim e^{\lambda S} v_0, \qquad \bar\psi \sim e^{-\lambda S} v_0$$

for $S$ and $v_0$ real, so that the above integral reduces essentially
to $\int_{V(\tau)} v_0^2\, dt\, dx_1 \cdots dx_n$, which is of the form treated above
with $G \equiv 1$, $k = 2$; furthermore, we have $Q = 0$ in this case. But
when rectangular coordinates are employed, we have also

$$H = V(x_1, \cdots, x_n) + \frac{1}{2}\sum_i \frac{y_i^2}{m_i},$$

where $V$ is the potential energy and $y_1, \cdots, y_n$ are the momental
coordinates corresponding to $x_1, \cdots, x_n$, respectively. Hence
(19) and (19') obtain, and the integral $I$ is invariant as stated.
Finally, since $d\tau = dt$, it follows at once that $\int_{V(\tau)} v_0^2\, dx_1 \cdots dx_n$
also remains constant over any region in $x_1 \cdots x_n$ space. Thus
we arrive at the following conclusion.

In the special case of the Schrödinger equation in $\psi$ with rectangular
coordinates, the asymptotic wave packets follow the corresponding
dynamical trajectories, while the squared amplitude
integral $\int |\psi|^2\, dx_1 \cdots dx_n$ remains constant over any part of the
packet, to terms of the order of $h$.
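
The dynamical trajectories that the packets are said to follow can be traced numerically. The sketch below is illustrative only and not part of the original address. It assumes one degree of freedom, the hypothetical potential $V(x) = x^2/2$ with unit mass, and a classical fourth-order Runge-Kutta step; it integrates the Hamiltonian equations $dx/dt = \partial H/\partial y$, $dy/dt = -\partial H/\partial x$ for $H = y^2/2m + V(x)$ and reports the energy, which should stay constant along a trajectory.

```python
import math


def hamilton_rhs(x, y, m=1.0):
    """Right-hand sides dx/dt = dH/dy, dy/dt = -dH/dx for H = y^2/(2m) + x^2/2 (assumed V)."""
    return y / m, -x


def rk4_step(x, y, dt):
    """One classical Runge-Kutta step for the pair (x, y)."""
    k1x, k1y = hamilton_rhs(x, y)
    k2x, k2y = hamilton_rhs(x + 0.5 * dt * k1x, y + 0.5 * dt * k1y)
    k3x, k3y = hamilton_rhs(x + 0.5 * dt * k2x, y + 0.5 * dt * k2y)
    k4x, k4y = hamilton_rhs(x + dt * k3x, y + dt * k3y)
    return (x + dt * (k1x + 2 * k2x + 2 * k3x + k4x) / 6,
            y + dt * (k1y + 2 * k2y + 2 * k3y + k4y) / 6)


def trajectory(x0, y0, dt, steps):
    """Follow the trajectory from (x0, y0) for the given number of steps."""
    x, y = x0, y0
    for _ in range(steps):
        x, y = rk4_step(x, y, dt)
    return x, y


def energy(x, y, m=1.0):
    """Total energy H for the assumed potential."""
    return y * y / (2 * m) + 0.5 * x * x


if __name__ == "__main__":
    x, y = trajectory(1.0, 0.0, 0.01, 300)
    print(x, y, energy(x, y))
```

For this potential the exact trajectory from $(1, 0)$ is $x = \cos t$, $y = -\sin t$, which gives a simple check on the integration.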

6. On Change of Independent Variables. Suppose now that we
make any change of independent variables

$$\bar{x}_i = f_i(x_1, \cdots, x_n), \qquad (i = 1, \cdots, n).$$

Because of the identities

$$\frac{\partial^{[1]}\psi}{\partial x_i} = \sum_j \frac{\partial \bar{x}_j}{\partial x_i}\frac{\partial^{[1]}\psi}{\partial \bar{x}_j}, \qquad \frac{\partial^{[2]}\psi}{\partial x_i\,\partial x_j} = \sum_{k,l}\frac{\partial \bar{x}_k}{\partial x_i}\frac{\partial \bar{x}_l}{\partial x_j}\frac{\partial^{[2]}\psi}{\partial \bar{x}_k\,\partial \bar{x}_l} + \frac{1}{\lambda}\sum_k \frac{\partial^2 \bar{x}_k}{\partial x_i\,\partial x_j}\frac{\partial^{[1]}\psi}{\partial \bar{x}_k}, \quad \cdots,$$

the components $L_0, L_1, \cdots$ in $L$ do not remain individually
invariant in the equation (1), and in particular the principal
part $L_0$ will not carry over into the new principal part by the
ordinary rules. In fact the coefficients transform by the rules
valid for the attached Hamilton-Jacobi equation. Hence Schrödinger’s
wave equation in the form (20) is only maintained (in
general) under a linear transformation of the independent variables.

This fact indicates that any coordinate system from which we
start is to be regarded as a privileged absolute system of reference
for the Schrödinger wave equation, up to an arbitrary linear
transformation.

7. Linear Systems and the Dirac Equations. Let us now turn
briefly to a system of $k$ homogeneous linear partial differential
equations in $\psi_1, \cdots, \psi_k$:

$$L_i(\psi_1, \cdots, \psi_k;\ \lambda) = L_{i0} + \frac{1}{\lambda} L_{i1} + \cdots = 0, \qquad (i = 1, \cdots, k). \tag{22}$$

Here $x_1, \cdots, x_n$ are the independent variables, and the same
operational symbols have been introduced as above. Now each
$L_{i0}$ may be written as a sum, $\sum_j L_{ij0}$, where $L_{ij0}$ contains only the
terms of $L_{i0}$ which involve $\psi_j$. Furthermore there is then a corresponding
set of polynomials $P_{ij}(x_1, \cdots, x_n;\ y_1, \cdots, y_n)$ obtained
as in the special case $k = 1$ treated above.

If we use formal series solutions,

$$\psi_i = e^{\lambda S}\left(v_{i0} + \frac{v_{i1}}{\lambda} + \cdots\right), \qquad (i = 1, \cdots, k),$$

and substitute in the $k$ given equations, the leading terms give
us the $k$ equations

$$\sum_j P_{ij}\, v_{j0} = 0, \qquad (i = 1, \cdots, k).$$

In order that these be consistent, the “multiplier equation” for
the system

$$P \equiv |P_{ij}| = 0$$

must be fulfilled. Furthermore if the “phase function” $S$ satisfies
this multiplier equation, then the linear equations just written
determine the $k$ functions $v_{i0}$ up to a proportionality factor.



For the determination of this proportionality factor $v$ and so
of $v_{i0}$, of $v_{i1}, \cdots$, we might proceed as before with a greater
degree of algebraic complication of course. It is sufficient for our
purposes, however, to observe that here too the phenomenon of
wave packets occurs. In fact by elimination we may reduce the
given system in various ways to a single linear differential equation
in a single unknown function $\psi$, linear in $\psi_i$ and their partial
derivatives. Its multiplier equation is then essentially $P = 0$,
with the same polynomial $P$ as before, since the phase functions
are the same as before. Hence asymptotic series solutions for $\psi$
having the nature of wave packets exist, associated with this
particular $P$, and so there exist also the corresponding solutions
$\psi_1, \cdots, \psi_k$. Thus there exist asymptotic wave packets for the
system which follow the Cauchy characteristics belonging to $P$.
The arbitrary proportionality factor in $v_{i0}$ on a transversal
surface $\Sigma$ corresponds to the arbitrary $v_0$ on $\Sigma$ in the series
for $\psi$.

Thus it is clear that any system of “wave equations” is corre¬ 
lated with a multiplier equation and the allied set of character¬ 
istics. 

Now the well known work of Sommerfeld showed that the
program of Schrödinger leads to a successful theory of the fine
structure spectral lines, if one takes account of the special theory
of relativity in a natural way; but that it fails to account
for certain magnetic properties of the atom. Pauli then substituted
a system of two wave equations of the first order for the
single Schrödinger equation of the second order so as to bring
about the indicated modifications; the multiplier equation $P = 0$
obtained is again that necessitated by the special theory of
relativity. Finally Dirac obtained a system of four equations of
the first order, with multiplier equation $P^2 = 0$.

Without attempting to analyze the formation of the elegant 
equations of Dirac, it may be pointed out that the retention of 
the multiplier equation in unaltered form is of itself sufficient to 
ensure the proper general form of the characteristic numbers 
and functions (see the second part of this lecture). It becomes 
then a very puzzling problem to discover whether the equations 
of Dirac are to be regarded as more than a set of equations built 
ad hoc. This is an issue upon which I do not feel myself compe¬ 
tent to pronounce. 




Part II. The Wentzel-Brillouin-Kramers Method and 

Asymptotic Series 

8. Formulation of the Problem.† In order to simplify the form
of statement of the problem, we shall write the wave equation
to be considered in the form

$$\frac{d^2\psi}{dx^2} + \lambda^2 (E - V(x))\,\psi = 0, \qquad \left(\lambda^2 = \frac{8\pi^2 m}{h^2}\right), \tag{23}$$

and assume that $V(x)$ is real and analytic for all real values of
$x$, and that it possesses a single (absolute) minimum for $x = x_0$,
such that $dV/dx = 0$, $d^2V/dx^2 > 0$ at $x_0$, while $dV/dx \neq 0$ for
$x \neq x_0$. Finally we assume that for $|x|$ large, $V$ admits a convergent
series expansion in $1/x$ of the form $a + b/x + \cdots$, so
that $\lim V(x) = a$ as $x$ becomes infinite. While this is a somewhat
idealized form of the case of physical importance, it will be
found that the method proposed in justification of the final
formula can be extended without essential modification to the
cases of physical interest.

We are concerned with the real solutions $\psi$ of (23) which vanish
both at $x = -\infty$ and $x = +\infty$, for real and positive $\lambda$. Now,
for $E \le V_0$, the coefficient of $\psi$ in (23) is everywhere negative
or 0, and elementary oscillation theorems show that no solution
$\psi$ can vanish for $x = \pm\infty$. On the other hand, if $E > a$, this
coefficient is everywhere positive, and all solutions oscillate
indefinitely often with an amplitude that need not approach 0;
this corresponds to the possibility of a continuous spectrum.
Hence we may assume that $E$ exceeds $V_0$ but is less than $a$.

Let us for the present consider $\lambda$ as a large positive parameter 
while $E$ is taken to be restricted as stated. We have then a 
boundary value problem of classical type, but with the difficulty 
arising from the singular nature of the boundary conditions 
(the boundaries lie at infinity) and from the fact that the coefficient 
of $\psi$ changes sign twice. As far as I know, problems of this 
singular type have not as yet been treated (see, however, Langer, 
loc. cit.). 

† We follow A. Zwaan and J. L. Dunham (loc. cit.) in using a complex 
variable $x$. Zwaan's treatment is extremely suggestive, although lacking in 
essential respects. 




9. An Auxiliary Lemma. In order to proceed further we shall 
need the following lemma. 

Lemma. In any leaf-shaped region $\sigma$ of the complex $x$-plane in 
which $V(x)$ is analytic and which can be covered by a regular 
family of curves from two points $P$ and $Q$ of its boundary, in such 
wise that 

(24) $\Re\big((V(x) - E)^{1/2}\,dx\big) \neq 0$ 

along each curve,† there will exist two solutions $\psi_i(x, \lambda)$, $(i = 1, 2)$, 
analytic in $x$ and $\lambda$, and asymptotically represented by the usual 
formal series solutions $s_i(x, \lambda)$, $(i = 1, 2)$, throughout $\sigma$. 

This lemma is a special case of an obvious extension of results 
contained in my doctoral thesis.‡ 

10. Application of the Lemma. By use of the above lemma, it 
is possible to determine such regions $\sigma$. We restrict attention to 
the neighborhood of the axis of reals in the $x$-plane, and let $x = \alpha$ 
and $x = \beta$ $(\alpha < x_0 < \beta)$ denote the two values of $x$ for which 
$V(x) - E$ vanishes, so that $V(x) - E$ is positive or negative according 
as $x$ lies outside of or within $(\alpha, \beta)$. For $x = \alpha$ this function 
decreases, while for $x = \beta$ it increases. 

Let us make a cut in the complex $x$-plane from $-\infty$ to $\alpha$, and 
from $\beta$ to $+\infty$, along the axis of reals and consider $(V(x) - E)^{1/2}$ 
in the cut plane; here we take the positive branch on the upper 
side of the cut $(-\infty, \alpha)$, and then determine this function 
throughout the cut plane. Evidently $(V(x) - E)^{1/2}$ is a pure 
imaginary quantity with negative coefficient of $i$ on the real 
axis between $\alpha$ and $\beta$, and is a negative real quantity on the 
upper side of the cut $(\beta, +\infty)$. On the lower side of the two cuts, 
the function is of course equal to the negative of its value at the 
same point on the upper side. 


† $\Re$ denotes the “real part of”; $E$ is regarded as fixed. 

‡ On the asymptotic character of the solutions of certain differential equations 
containing a parameter, Transactions of this Society, vol. 9 (1908), pp. 
219-230. It will be found that for an equation of order $n$ with real parameter $\rho$ 
(notation of my paper) it is sufficient that along the curves $PQ$ in $\sigma$ we have 

$\Re(w_1\,dx) \le \Re(w_2\,dx) \le \cdots \le \Re(w_n\,dx).$ 

This convenient condition for maintenance of asymptotic form in $\sigma$ was known 
to me in 1908, and is indeed obvious from my paper. 




Now let us consider the curves 

(25) $\Re\big((V(x) - E)^{1/2}\,dx\big) = 0,$ 

which evidently play an important part in the determination of 
possible regions $\sigma$. Under the assumptions made above, the general 
nature of these curves near the $x$ axis is readily seen to be 
that indicated in the figure below. 



This leads at once to five special types of regions $\sigma$: I: $\sigma$ above 
$2\alpha\beta3$; II: $\sigma$ above $1\alpha\beta4$; III: $\sigma$ below $1\alpha\beta4$; IV: $\sigma$ below $2\alpha\beta3$; 
V: $\sigma$ between $1\alpha2$ and $3\beta4$. 

Here the boundaries of these regions are usually to be excluded. 
We shall, however, assume that the point $P$ or $Q$ of the 
Lemma in regions $\sigma$ of types I-IV may be taken at an infinite end 
of the real axis. A critical investigation of the validity of this 
reasonable assumption is being undertaken by Mr. A. C. Galbraith 
at Harvard University. 

11. The Distribution of the Characteristic Values. With these 
preliminaries in hand we are prepared to determine the distribution 
of the characteristic values $\lambda$. In the first place we are only 
interested in the solutions $\psi$ of (23) which tend to 0 as $x$ tends 
to $\pm\infty$. Now we have two formal solutions 

$s_1(x, \lambda) \sim e^{\lambda \int (V - E)^{1/2}\,dx}\; t_1(x, \lambda), \qquad s_2(x, \lambda) \sim e^{-\lambda \int (V - E)^{1/2}\,dx}\; t_2(x, \lambda),$ 

where we take $x$ on the upper side of the cut $(-\infty, \alpha)$ to begin 
with, and $(V - E)^{1/2}$ as positive there. Here the $t_i(x, \lambda)$ are ordinary 
power series in $1/\lambda$, and we may suppose that $s_1$ and $t_1$ go 
into $s_2$ and $t_2$, respectively, as we traverse the cuts. 




Now the unique formal solution of the first type above which 
reduces formally to 1 for $x = \gamma$ (see the figure) is clearly 

$s_1(x, \lambda)/s_1(\gamma, \lambda) = e^{\lambda \int_\gamma^x (V - E)^{1/2}\,dx}\; t_1(x, \lambda)/t_1(\gamma, \lambda).$ 

According to the Lemma there is a corresponding $\psi_1$ having this 
asymptotic form in the region I, which will evidently approach 
0 exponentially as $x$ approaches $-\infty$. Similarly there will exist 
a solution $\psi_2$ represented by $s_2(x, \lambda)/s_2(\gamma, \lambda)$ which approaches 
$\infty$ under the same circumstances. Hence $\psi_1(x, \lambda)$ is essentially 
the only solution (up to a constant multiplier) which remains 
finite as $x$ approaches $-\infty$, and $\psi_1(x, \lambda)/\psi_1(\gamma, \lambda)$ is a special 
solution, $\psi^*(x, \lambda)$, with the same asymptotic form, which reduces 
to 1 for $x = \gamma$. This solution is clearly real for $x$ real, since 
$V(x)$ is real for $x$ real. Hence $\psi^*$ is represented asymptotically by 
$s_1(x, \lambda)/s_1(\gamma, \lambda)$ in the combined region I + III, since for $x$ below 
the real axis, $\psi^*$ is conjugate to its value at the conjugate point 
above the axis, and $s_1(x, \lambda)/s_1(\gamma, \lambda)$ has the same formal property. 
Taking account of the cut, however, we have 

$\psi^*(x, \lambda) \sim \begin{cases} s_1(x, \lambda)/s_1(\gamma, \lambda) & \text{in I, above}, \\ s_2(x, \lambda)/s_2(\gamma, \lambda) & \text{in III, below}. \end{cases}$ 


Similarly in the regions II + IV we are led to fix attention upon 
a solution with the asymptotic representation 

$\begin{cases} s_1(x, \lambda)/s_1(\delta, \lambda) & \text{in II, above}, \\ s_2(x, \lambda)/s_2(\delta, \lambda) & \text{in IV, below}, \end{cases}$ 

as yielding the only possible solution which approaches 0 as $x$ 
becomes positively infinite. It is to be noted that $s_1(x, \lambda)$ is 
representative of the formal solution which is asymptotic to 0 as $x$ 
approaches $+\infty$ along the upper side of $(\beta, \infty)$. Moreover, for a 
characteristic value and only then, these two solutions must be 
proportional; that is, their ratio must reduce to essentially the 
same function $f(\lambda)$ in the overlapping parts above $1\alpha\beta3$, and 
below $2\alpha\beta4$. Hence we are led to the necessary relation 

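(26) $\dfrac{s_1(\delta, \lambda)}{s_1(\gamma, \lambda)} = \dfrac{s_2(\delta, \lambda)}{s_2(\gamma, \lambda)}.$ 

Consequently if we write 

$s_i(x, \lambda) = e^{\int (d \log \psi_i/dx)\,dx}, \qquad (i = 1, 2),$ 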





so that $\psi_i$ is a formal solution of the wave equation (23), the 
above relation yields 

$\displaystyle \int_\gamma^\delta \frac{d \log \psi_1(x)}{dx}\,dx + \int_\delta^\gamma \frac{d \log \psi_2(x)}{dx}\,dx = 2k\pi i, \qquad (k \text{ an integer}),$ 

or, more briefly, since $d \log \psi_1$ changes to $d \log \psi_2$ as we traverse 
the cut, 

$\displaystyle \oint \frac{d \log \psi_1}{dx}\,dx = 2k\pi i,$ 

where the path of integration is a positive loop around the 
points $\alpha$ and $\beta$. Written out explicitly this gives the series 



$\displaystyle \oint (V(x) - E)^{1/2}\,dx + \cdots = \frac{(k + \tfrac12)\,h}{(2m)^{1/2}}\,\sqrt{-1}, \qquad (k \text{ an integer}),$ 

which is essentially the desired Wentzel-Brillouin-Kramers 
equation. 
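
The first two terms of this series can be sketched as follows. The Riccati variable $y = d\log\psi/dx$ and the coefficients $y_0, y_1, y_2$ are notation introduced here for illustration, and the value $\lambda^2 = 8\pi^2 m/h^2$ is assumed (the usual constant of the one-dimensional Schrödinger equation); this is a sketch under those assumptions, not a transcription of the original display.

```latex
% Riccati form of (23), with y = d(log psi)/dx:
y' + y^2 = \lambda^2\,(V(x) - E), \qquad
y = \lambda y_0 + y_1 + \lambda^{-1} y_2 + \cdots .
% Comparing the coefficients of lambda^2 and lambda:
y_0^2 = V - E, \qquad 2 y_0 y_1 + y_0' = 0
\;\Longrightarrow\; y_1 = -\tfrac14\,\frac{V'}{V - E}.
% V - E has simple zeros at alpha and beta, so on a positive loop around both
\oint y_1\,dx = -\tfrac14 \cdot 2\pi i \cdot 2 = -\pi i .
% The condition \oint y\,dx = 2k\pi i therefore reads
\lambda \oint (V - E)^{1/2}\,dx + O(\lambda^{-1}) = (2k + 1)\,\pi i ,
% that is, with lambda = 2 pi (2m)^{1/2} / h,
\oint (V - E)^{1/2}\,dx + \cdots
  = \bigl(k + \tfrac12\bigr)\,\frac{h}{(2m)^{1/2}}\,\sqrt{-1}.
```

The half-integer $k + \tfrac12$ thus comes entirely from the term $y_1$, whose loop integral counts the two simple zeros $\alpha$ and $\beta$ of $V - E$.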

We shall not attempt to consider the degree of precision with 
which this equation leads to satisfactory approximations to 
the energy levels $E_0, E_1, \cdots$. It may be anticipated, however, 
that the approximation will be good whenever the terms of the 
series on the right diminish rapidly; and that an exact result is 
obtained in case the series involved converges. 
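
As a rough numerical illustration of the leading term of this equation, the sketch below solves $2\lambda\int_\alpha^\beta (E - V)^{1/2}\,dx = (2k + 1)\pi$ for $E$ by bisection. The function names, the quadrature, and the test potential $V = x^2$ are assumptions made here for illustration; $V = x^2$ does not tend to a finite limit $a$ at infinity, so it lies outside the hypotheses of section 8, but it is a case in which the leading term is known to reproduce the exact levels $E_k = (2k + 1)/\lambda$.

```python
import math


def turning_point(V, E, x0, direction):
    """Locate the root of V(x) = E on one side of the minimum x0.

    Assumes V(x0) < E and that V increases monotonically away from x0
    until it exceeds E (otherwise the outward march never stops).
    """
    inside, outside = x0, x0 + direction
    while V(outside) < E:                      # march outward, doubling the distance
        inside, outside = outside, x0 + 2.0 * (outside - x0)
    for _ in range(100):                       # bisection between inside and outside
        mid = 0.5 * (inside + outside)
        if V(mid) < E:
            inside = mid
        else:
            outside = mid
    return 0.5 * (inside + outside)


def phase_integral(V, E, x0, n=2000):
    """Integral of sqrt(E - V) between the two turning points alpha and beta."""
    alpha = turning_point(V, E, x0, -1.0)
    beta = turning_point(V, E, x0, +1.0)
    mid, half = 0.5 * (alpha + beta), 0.5 * (beta - alpha)
    total = 0.0
    for j in range(n):
        # x = mid + half*sin(theta) removes the square-root endpoint singularity
        theta = -0.5 * math.pi + (j + 0.5) * math.pi / n
        x = mid + half * math.sin(theta)
        total += math.sqrt(max(E - V(x), 0.0)) * half * math.cos(theta)
    return total * math.pi / n


def wbk_level(V, x0, lam, k, E_lo, E_hi):
    """Solve 2*lam*phase_integral(E) = (2k + 1)*pi for E in [E_lo, E_hi]."""
    target = (2 * k + 1) * math.pi / lam
    for _ in range(60):
        E = 0.5 * (E_lo + E_hi)
        if 2.0 * phase_integral(V, E, x0) < target:
            E_lo = E                           # phase integral grows with E
        else:
            E_hi = E
    return 0.5 * (E_lo + E_hi)


if __name__ == "__main__":
    for k in range(3):
        print(k, wbk_level(lambda x: x * x, 0.0, 10.0, k, 1e-9, 10.0))
```

For a potential satisfying the hypotheses of section 8 the upper bracket `E_hi` would have to be chosen below the limit $a$.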

Harvard University 




Reprinted from Comptes rendus du Congrès International des Mathématiciens, 
Oslo 1936, Oslo, 1937, pp. 207-225. 


EXTRAIT 

Congrès International des Mathématiciens 
Oslo 1936 

THE FOUNDATION OF QUANTUM MECHANICS 

By G. D. Birkhoff, Cambridge, Mass. 

Introduction. 

In this Address I propose to give in outline the general point of 
view concerning the foundations of quantum mechanics to which I have 
been led in recent years. My paper will consist of three parts: a first 
part in which I take up a purely mathematical thread involving the general 
theory of asymptotic series which, as I have shown previously, serves to 
join together many of the accepted quantum mechanical ideas of the present 
day, and which will undoubtedly be of use in further developments;¹ a 
second part in which I give a brief critique of earlier and current physical 
theories with the purpose of mentioning the logical difficulties involved;² 
and a third part dealing with the further development of an explicit 
conceptual theory which I first outlined in 1926, which seems to me to afford 
a highly suggestive physical model for quantum mechanics.³ 

Within the compass of the present brief paper I can do no more than 
sketch some of the main results. It is my intention to supplement this 
account by an extensive Memoir to be published elsewhere. 

Part I. 

Linear Parametric Wave Equations and Quantum Mechanics. 

1. Parametric Wave Equations. Wave Packets. 

By a linear parametric wave equation I mean an equation of the type 

(1) $L(\psi, \lambda) = 0$ 

in which $L$ denotes any linear homogeneous differential expression in the 
dependent variable $\psi$, with independent variables $x_1, \cdots, x_n$ and involving 
a large parameter $\lambda$. In order to avoid difficulty I assume that all the 
coefficients are analytic in the variables $x_1, \cdots, x_n, \lambda$, and are expansible in 
convergent power series in $1/\lambda$ for $\lambda$ large. 


¹ See my paper “Quantum Mechanics and Asymptotic Series” published in the Bulletin 
of the American Mathematical Society in October, 1933. 

² See my paper “A Mathematical Critique of Some Physical Theories” published in the 
Bulletin of the American Mathematical Society, March-April, 1937. 

³ See my Notes entitled “A Theory of Matter and Electricity” and “The Hydrogen Atom 
and the Balmer Formula” appearing in the Proceedings of the National Academy of 
Sciences, March 1927. 





Now it has frequently been found that certain important solutions $\psi$ 
exist which have the peculiar property that their derivatives with respect 
to $x_i$ are asymptotically obtained by mere multiplication by $\lambda\,\partial S/\partial x_i$, where 
$S$ is a suitable ‘phase function’, so that we have 

$\dfrac{\partial \psi}{\partial x_i} \sim \lambda\,\dfrac{\partial S}{\partial x_i}\,\psi.$ 

Thus, in the simple case of the equation $\psi'' + \lambda^2 \psi = 0$ we have two exact 
solutions $e^{\pm \lambda i x}$, so that $S = \pm ix$ in this case. If we take account of higher terms 
in such an asymptotic relationship, we are led to search for formal solutions 
of the type 

(2) $\psi = e^{\lambda S}\left(v_0 + \dfrac{v_1}{\lambda} + \dfrac{v_2}{\lambda^2} + \cdots\right)$ 

where $v_0, v_1, \cdots$ are functions of $x_1, \cdots, x_n$. It is found for the purposes 
of the applications that the question of the convergence or divergence of 
such formal series need not be considered. 

By direct substitution and comparison of coefficients we may determine 
all of the conditions which must be imposed upon the multiplier function 
$S$ and the successive coefficients $v_0, v_1, \cdots$ in order that a given series 
constitutes a formal solution. For the purpose of carrying out the formal 
reckoning it is very convenient indeed to introduce the following abbreviated 
notation, which I used in 1908: 

(3) $\dfrac{\partial^{[1]} w}{\partial x_i} = \dfrac{1}{\lambda}\,\dfrac{\partial w}{\partial x_i}, \qquad \dfrac{\partial^{[2]} w}{\partial x_i\,\partial x_j} = \dfrac{1}{\lambda^2}\,\dfrac{\partial^2 w}{\partial x_i\,\partial x_j}, \qquad \cdots$ 


The first of the conditions so obtained is then an equation determining $S$, 

(4) $P\left(x_1, \cdots, x_n, \dfrac{\partial S}{\partial x_1}, \cdots, \dfrac{\partial S}{\partial x_n}\right) = 0.$ 

Here the explicit expression for $P$ is a certain polynomial of degree $m$ in 
$\partial S/\partial x_i$ $(i = 1, \cdots, n)$, where $m$ is the order of the given wave equation. 
This equation may be termed the ‘multiplier equation’. Thus the possible 
phase functions are obtained as the solution of a partial differential equation 
(4) which is polynomial and of the first order in the dependent variable $S$, 
but in which $S$ does not itself appear. 

If we proceed further we are able to calculate $v_0, v_1, \cdots$ in succession. 
This turns out to occur in the following manner. 

Associated with the function $S$ there is a corresponding canonical 
dynamical system with Hamiltonian function $P$; namely, 

(5) $\dfrac{dx_i}{d\tau} = \dfrac{\partial P}{\partial y_i}, \qquad \dfrac{dy_i}{d\tau} = -\dfrac{\partial P}{\partial x_i}.$ 





This canonical system will define certain trajectories in $x_1, \cdots, x_n$ space, 
with general solution given by the $2n$ equations 

(6) $y_i = \dfrac{\partial S}{\partial x_i}, \qquad \dfrac{\partial S}{\partial c_i} = \text{const.},$ 

where $c_1, \cdots, c_n$ are the $n$ arbitrary constants contained in a complete 
solution $S$ of the multiplier equation. If we let $\tau$ denote the time along 
such a trajectory, the successive conditions upon $v_0, v_1, \cdots$ take the form 

(7) $\dfrac{dv_0}{d\tau} + \Phi v_0 = 0,$ 

$\dfrac{dv_1}{d\tau} + \Phi v_1 + \Phi_1(v_0) = 0,$ 

$\dfrac{dv_2}{d\tau} + \Phi v_2 + \Phi_1(v_1) + \Phi_2(v_0) = 0,$ 

$\cdots$ 


where $\Phi$ is an explicitly given function and where $\Phi_1, \Phi_2, \cdots$ stand for 
explicit homogeneous linear differential expressions of order indicated by the 
subscript. Hence, if $\Sigma$ is a surface in $x_1, \cdots, x_n$ space, which cuts the 
$(n-1)$-dimensional family of trajectories at an angle different from zero, then 
the functions $v_0, v_1, \cdots$ may be assigned at pleasure on $\Sigma$, but are then 
determined throughout $x_1, \cdots, x_n$ space. In particular we may suppose that 
these functions $v_0, v_1, \cdots$ are assigned arbitrarily on a small region $\sigma$ of 
the transversal surface $\Sigma$ and are continuous together with all of their 
partial derivatives, but vanish along the boundaries of this region $\sigma$ and 
outside of it. In this case we obtain an asymptotic ‘wave packet’ solution 
corresponding to the given phase function, which vanishes everywhere except 
within the tube of trajectories standing on the given small region $\sigma$. 

Except in the case of the single ordinary differential equation ($n = 1$), 
the domain in which such formal solutions correspond to actual solutions 
has not as yet been determined. There is no doubt, however, that such actual 
solutions exist in general in suitably restricted domains of $x_1, \cdots, x_n$ space. 

For our purposes it is worth while to indicate how the multiplier 
equation is actually to be obtained in any given case. We begin by 
using the symbolic operator mentioned above and writing the given linear 
wave equation in the form 

(8) $L = L_0 + \dfrac{1}{\lambda} L_1 + \dfrac{1}{\lambda^2} L_2 + \cdots$ 






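by use of this operator. Here $L_0, L_1, \cdots$ are themselves linear and homogeneous 
in products of the operators $\partial^{[1]}/\partial x_i$ $(i = 1, \cdots, n)$ and do not contain 
explicitly the parameter $\lambda$. Thus if we began with the equation of Fourier, 

$\psi'' + \dfrac{1}{x}\,\psi' + \lambda^2 \psi = 0,$ 

the expansion referred to would be 

$\left(\dfrac{\partial^{[2]} \psi}{\partial x^2} + \psi\right) + \dfrac{1}{\lambda}\,\dfrac{1}{x}\,\dfrac{\partial^{[1]} \psi}{\partial x} = 0.$ 

The rule giving the multiplier equation is then the following: The 
polynomial $P$ in the multiplier equation $P = 0$ is obtained by replacing 
the derivatives $\partial^{[1]}\psi/\partial x_1, \cdots, \partial^{[1]}\psi/\partial x_n, \partial^{[2]}\psi/\partial x_1^2, \cdots$ in $L_0$ by the respective 
expressions $\partial S/\partial x_1, \cdots, \partial S/\partial x_n, (\partial S/\partial x_1)^2, \cdots$. Thus in the case of the 
equation just mentioned the multiplier equation is immediately seen to be 

$\left(\dfrac{dS}{dx}\right)^2 + 1 = 0,$ 

which yields, of course, $S = \pm ix$. 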
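
The example can be checked numerically. The sketch below takes the equation of Fourier to be $\psi'' + \psi'/x + \lambda^2\psi = 0$ and uses the leading formal solution $e^{\lambda S} v_0$ with $S = ix$; the amplitude $v_0 = x^{-1/2}$, the function names, and the finite-difference step are assumptions introduced here, not taken from the text. With these choices the left-hand side of the equation, divided by $\psi$, should be close to $1/(4x^2)$, which is small compared with $\lambda^2$.

```python
import cmath


def psi(x, lam):
    """Leading formal solution exp(lam*S)*v0 with S = i*x and v0 = x**-0.5 (assumed)."""
    return cmath.exp(1j * lam * x) / x ** 0.5


def residual_ratio(x, lam, h=1e-4):
    """Return (psi'' + psi'/x + lam**2 * psi) / psi using central differences."""
    p_minus, p0, p_plus = psi(x - h, lam), psi(x, lam), psi(x + h, lam)
    d1 = (p_plus - p_minus) / (2.0 * h)            # first derivative
    d2 = (p_plus - 2.0 * p0 + p_minus) / (h * h)   # second derivative
    return (d2 + d1 / x + lam * lam * p0) / p0


if __name__ == "__main__":
    x, lam = 2.0, 20.0
    r = residual_ratio(x, lam)
    print("residual / psi      :", r)
    print("expected 1/(4 x^2)  :", 1.0 / (4.0 * x * x))
    print("relative to lam^2   :", abs(r) / lam ** 2)
```

The remaining residual is what the higher coefficients $v_1, v_2, \cdots$ of the formal series are there to remove.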

Suppose now that we consider more generally a system of $k$ linear 
homogeneous partial differential equations in $\psi_1, \cdots, \psi_k$, 

(8') $L_i \equiv L_{i0} + \dfrac{1}{\lambda} L_{i1} + \dfrac{1}{\lambda^2} L_{i2} + \cdots = 0, \qquad (i = 1, \cdots, k).$ 

Here the various operators $L_i$ have been expanded by the use of the 
symbolic operator in the manner indicated above. In this case we search 
for formal series solutions of the following type 

(2') $\psi_i = e^{\lambda S}\left(v_{i0} + \dfrac{v_{i1}}{\lambda} + \cdots\right), \qquad (i = 1, \cdots, k).$ 

As before, an infinite series of conditions is obtained, the first being $k$ 
equations, 

(9) $\sum_{j=1}^{k} P_{ij}\, v_{j0} = 0, \qquad (i = 1, \cdots, k),$ 

where $P_{ij}$ is obtained from the terms of $L_{i0}$ acting on $\psi_j$ just as $P$ was 
obtained from $L$ in the simple case of a single equation considered above. 
Thus there arises a determinantal multiplier equation, namely 

(10) $P \equiv |P_{ij}| = 0.$ 

If the phase function $S$ satisfies this partial differential equation, then the 
$k$ linear homogeneous algebraic equations (9) determine the $k$ functions 




$v_{10}, \cdots, v_{k0}$ up to a proportionality factor. Here we suppose that 
the rank of the determinant $P$ is $k - 1$. We then proceed to the later sets 
of conditions determining $v_{i1}, v_{i2}, \cdots$. It turns out that $v_{i1}, v_{i2}, \cdots$ are 
successively determined in much the same way as $v_0, v_1, \cdots$ were above, 
although there is, of course, a greater degree of algebraic complication. In 
fact, if we set up the canonical system (5) associated with the Hamiltonian 
function $P$ just as we did before, we find that there will exist ‘wave packet’ 
solutions which vanish except along a small tube of trajectories. 

Thus there is associated with the single parametric wave equation and 
with the system of such equations, wave packet solutions which vanish save 
near to one of the associated dynamical trajectories. In the case of the single 
linear parametric wave equation or of the system of such wave equations, 
the formal wave packet solutions no doubt correspond to actual solutions 
in an asymptotic sense. 

In this way we arrive at a very general association of ‘waves’ and 
‘particles’ of the kind of which so much use has been made in current 
quantum mechanics. 


2. The Ordinary and Relativistic Schrödinger Wave Equation. 

In order to arrive at the Schrödinger wave equation of quantum mechanics 
the following rule is specified: The desired wave equation is obtained 
from the Hamilton-Jacobi partial differential equation 

(11) $\dfrac{\partial S}{\partial t} + H\left(x_1, \cdots, x_n, \dfrac{\partial S}{\partial x_1}, \cdots, \dfrac{\partial S}{\partial x_n}\right) = 0$ 

by regarding this last equation as the ‘multiplier equation’ with $\lambda = 2\pi i/h$ 
($h$, Planck’s constant). Here the variables $x_1, \cdots, x_n$ are geometrical variables 
of Cartesian type attached to the atomic dynamical problem, $y_1, \cdots, y_n$ 
are the associated conjugate variables, and $H$ denotes the total energy. Thus 
we obtain the Schrödinger equation in the symbolic form 

(12) $\dfrac{1}{\lambda}\,\dfrac{\partial \psi}{\partial t} + H\left(x_1, \cdots, x_n, \dfrac{1}{\lambda}\dfrac{\partial}{\partial x_1}, \cdots, \dfrac{1}{\lambda}\dfrac{\partial}{\partial x_n}\right)\psi = 0.$ 


If now we set 

$\psi = e^{-\lambda E t}\,\psi',$ 

the equation takes the simpler form 

(13) $(H - E)\,\psi' = 0,$ 




in which the time $t$ no longer appears. If further there is adjoined the 
condition that the function $\psi$ is to vanish at infinity, we obtain a linear 
boundary value problem of classical (Hermitian) type. 

The fundamental rule of non-relativistic quantum mechanics for determining 
the atomic frequencies can now be formulated as follows: the 
characteristic numbers $E_i$ in the boundary value problem thus specified are 
taken to determine the ‘energy levels’ $E_i$ while the spectral frequencies $\nu_{mn}$ 
themselves are given by the formula 

(14) $h\,\nu_{mn} = E_m - E_n,$ 


in accordance with the Planck-Einstein law. 
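
A small worked instance of formula (14) may help. The sketch below uses the hydrogen levels $E_n = -R/n^2$ as sample input; these levels, the constant values, and the function names are assumptions brought in for illustration and are not derived in the text.

```python
H_PLANCK_EV_S = 4.135667696e-15   # Planck's constant h in eV*s
RYDBERG_EV = 13.605693            # hydrogen binding energy in eV (infinite nuclear mass)
C_M_PER_S = 299792458.0           # speed of light in m/s


def energy_level(n):
    """Sample energy level E_n = -R / n**2 (assumed hydrogen-like spectrum)."""
    return -RYDBERG_EV / (n * n)


def spectral_frequency(m, n):
    """Formula (14): h * nu_mn = E_m - E_n, solved for nu_mn in Hz."""
    return (energy_level(m) - energy_level(n)) / H_PLANCK_EV_S


def wavelength_nm(m, n):
    """Vacuum wavelength in nanometres of the line with frequency nu_mn."""
    return C_M_PER_S / spectral_frequency(m, n) * 1e9


if __name__ == "__main__":
    print("nu_32 in Hz     :", spectral_frequency(3, 2))
    print("wavelength in nm:", wavelength_nm(3, 2))
```

With $m = 3$, $n = 2$ this gives the red line of the Balmer series, near 656 nm.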

As a matter of fact, this program has had to be modified considerably 
by the introduction of ‘electron-spin’ and the use of a further arbitrary 
rule given by the Pauli exclusion principle. 

In the program of quantum mechanics as thus far specified there is 
no provision for taking account of the relativistic framework appropriate to 
the electro-magnetic field. As a step in remedying this deficiency a somewhat 
analogous but distinct point of departure was suggested by Schrödinger. 
This is as follows. We begin by writing down the relativistic equations 
for the motion of an electron in the field produced by a static proton 
at the origin of co-ordinates. However, instead of using the co-ordinates 
$x, y, z$ of the ordinary Cartesian type, we now employ the four-dimensional 
co-ordinates $x_1 = x$, $x_2 = y$, $x_3 = z$, $x_4 = ict$ appropriate to the space-time 
of the special theory of relativity, while taking the proper time $s$ as a new 
independent variable. In this way we are again led to a dynamical system 
of Hamiltonian type with principal function 

4 

( 15 ) AT- »-**'>* 

i=\ 


where $e\Phi_1, \cdots, e\Phi_4$ denote the four components of the usual vector potential 
associated with the charge $e$ at the origin. This principal function $K$ does 
not correspond to the total energy since we have $K = 0$ for any possible 
motion. In this case we are at first led to use the symbolic equation 


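(16) $K\left(x_1, \cdots, x_4, \dfrac{1}{\lambda}\dfrac{\partial}{\partial x_1}, \cdots, \dfrac{1}{\lambda}\dfrac{\partial}{\partial x_4}\right)\psi = 0$ 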


as the conjectural relativistic wave equation. Here K= 0 is taken to be 
the multiplier equation. 




Unfortunately, this type of relativistic wave equation does not lead to 
the desired ‘fine structure’ formula for the spectral lines of hydrogen, since 
it introduces certain half integral quantum numbers in place of the correct 
integral values appearing in the Sommerfeld formula. 


3. The Dirac Equations. 


An obvious suggestion which now arises is to take as the starting 
point a system of linear parametric wave equations rather than a single 
wave equation; in particular the single particle problem suggests the possibility 
of beginning with four parametric wave equations of the first order 
in four dependent variables $\psi_1, \cdots, \psi_4$ and the four independent variables 
$x_1, x_2, x_3, x_4$. If we take the simplest possible case of a free particle, it 
is natural to expect that the coefficients in the differential equations then 
reduce to mere constants, and that the wave packets necessarily follow the 
beam of light in this case. Evidently such a parametric wave equation may 
be given the form: 


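(17) $\sum_{j,k=1}^{4} c_{ijk}\,\dfrac{\partial^{[1]} \psi_j}{\partial x_k} + \psi_i = 0, \qquad (i = 1, \cdots, 4),$ 

with multiplier equation 

$P \equiv \Big|\sum_{k=1}^{4} c_{ijk}\, y_k + \delta_{ij}\Big| = 0, \qquad (\delta_{ij} = 0,\ i \neq j;\ \delta_{ii} = 1),$ 

where we must have essentially 

(18) $P = A^2, \qquad (A = y_1^2 + y_2^2 + y_3^2 + y_4^2 + 1),$ 

if the wave packets are to follow a rectilinear path with the velocity of light. 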

Now the simplest way in which the multiplier equation $P = 0$ might 
take the desired form is for all third order minors in $P$ to contain $A$ as 
a factor. As I mentioned in my article of 1927 referred to above, this 
leads directly to a determination of the constants $c_{ijk}$ which is the same 
as that given by Dirac. 
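
The two algebraic facts used here can be checked on a concrete set of constants. The sketch below assumes the normalization $A = y_1^2 + y_2^2 + y_3^2 + y_4^2 + 1$ and takes $c_{ijk} = i\,(\gamma_k)_{ij}$, where the $\gamma_k$ are one standard set of four anticommuting Hermitian $4 \times 4$ matrices built from the Pauli matrices. This particular representation and all function names are assumptions chosen for illustration, not the constants as printed in the 1927 article.

```python
from itertools import permutations


def det(m):
    """Determinant by the Leibniz permutation sum (adequate for small matrices)."""
    n = len(m)
    total = 0
    for perm in permutations(range(n)):
        sign = 1
        for a in range(n):
            for b in range(a + 1, n):
                if perm[a] > perm[b]:          # each inversion flips the sign
                    sign = -sign
        term = sign
        for row, col in enumerate(perm):
            term *= m[row][col]
        total += term
    return total


def block(a, b, c, d):
    """Assemble a 4x4 matrix from 2x2 blocks [[a, b], [c, d]]."""
    return [a[0] + b[0], a[1] + b[1], c[0] + d[0], c[1] + d[1]]


def scale(s, m):
    return [[s * v for v in row] for row in m]


ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
SIGMA = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
# gamma_k = [[0, -i sigma_k], [i sigma_k, 0]] for k = 1, 2, 3; gamma_4 = diag(1, 1, -1, -1)
GAMMA = [block(ZERO, scale(-1j, s), scale(1j, s), ZERO) for s in SIGMA]
GAMMA.append(block(ONE, ZERO, ZERO, scale(-1, ONE)))


def multiplier_matrix(y):
    """Matrix delta_ij + sum_k c_ijk y_k with the assumed c_ijk = i * (gamma_k)_ij."""
    return [[(1 if r == c else 0) + 1j * sum(y[k] * GAMMA[k][r][c] for k in range(4))
             for c in range(4)] for r in range(4)]


def third_order_minors(m):
    """All sixteen 3x3 minors of a 4x4 matrix."""
    idx = range(4)
    return [det([[m[r][c] for c in idx if c != dc] for r in idx if r != dr])
            for dr in idx for dc in idx]


if __name__ == "__main__":
    y = (0.3, -1.2, 0.5, 2.0)
    A = 1 + sum(v * v for v in y)
    print("P   =", det(multiplier_matrix(y)))
    print("A^2 =", A * A)
    # a complex point on A = 0: every third order minor should vanish there
    y_null = (0.3, -1.2, 0.5, 1j * (1 + 0.3 ** 2 + 1.2 ** 2 + 0.5 ** 2) ** 0.5)
    print("largest minor on A = 0:",
          max(abs(v) for v in third_order_minors(multiplier_matrix(y_null))))
```

On $A = 0$ the matrix has rank two, which is why every third order minor vanishes there and the determinant is the square of $A$ rather than $A$ times an unrelated factor.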

How then are we going to pass to the more general case where the electron 
moves in a general static electro-magnetic field? The well-known formalism 
of the theory of electro-magnetism suggests immediately that the proper 
equations will be obtained by replacing $y_1, \cdots, y_4$ by $y_1 - e\Phi_1, \cdots, y_4 - e\Phi_4$ 
respectively. It is found that this heuristic process yields a linear boundary 
value problem of Hermitian type leading to the correct fine-structure formula 
for the hydrogen atom. Thus, by a process of genial conjecture, a rule 



for obtaining spectral frequencies has been devised. However, this leaves 
quantum mechanics in a position analogous to that of theoretical astronomy 
after the discovery of Kepler’s laws. In fact, calculations of spectral 
frequency can be made with a considerable degree of accuracy; but a proper 
conceptual background is lacking. If this be the case, we may well look 
for a conceptual theory which explains quantum mechanical laws much as 
the gravitational theory of Newton explained the laws formulated by Kepler. 

In Part III of the present paper I have ventured to present an attempt 
at the construction of such a conceptual theory. 


Part II. 

Critique of Previous Physical Theories. 

1. Some Mathematical Difficulties in Classical Physics. 

In trying to construct a conceptual model useful for quantum mechanics 
I have found it of great service to take explicit account of certain 
mathematical difficulties inherent in the earlier physical models. 

At the very outset we may lay to one side the difficulties which 
physicists felt to be present in the Newtonian Law of Gravitation, such as 
'action at a distance’. In fact the relativistic theory of gravitation due 
to Einstein fits naturally into the conceptual scheme here proposed. 

The first difficulty (1) to which I wish to call special attention is that 
of indeterminacy. This arises when we use the rigid bodies and particles 
of classical dynamics as models. If, for example, we attempt to build 
the kinetic theory of gases upon the model of an assemblage of equal rigid 
spherical molecules, the difficulty of indeterminacy arises as follows. Suppose 
that three equal spheres approach a point with equal velocity, while the 
lines of motion of the centers are 120° apart and in the same plane. If 
the spheres collide simultaneously, considerations of symmetry indicate that 
they must rebound back along the lines of approach with the same velocity 
as that of approach. But it is easily shown that if two of them collide 
ever so little sooner than these two collide with the third, the resulting 
motion will be decidedly different in character. Similar difficulties of 
indeterminacy occur when three or more mass particles, attracting one 
another according to the Newtonian Law, collide simultaneously. 

Another difficulty (2) is that of excessive collision velocities. Suppose, 
for example, that two steel spheres approach one another with a relative 
velocity which exceeds twice the velocity of sound in steel. In this case 
the usual elastic theory will no longer avail to follow the motion after 




collision, inasmuch as the disturbance created by the impact cannot be 
propagated sufficiently rapidly. 

A kindred difficulty (3) is that of a tendency towards disorganized 
motion, which is usually ignored in dealing with fluids or elastic bodies. 
Suppose that a perfectly elastic body is set in a simple state of vibration. 
Then, except in rare cases, the motion will tend to become more and 
more irregular in microscopic character, although the total kinetic energy 
remains fixed. Evidently the resulting state of disorganized motion involves 
fundamental difficulties of observation and prediction. This difficulty is 
sometimes thought of as a tendency towards indefinite increase in frequency 
of vibration (e.g. the so-called ‘violet catastrophe’). 

In the early development of electro-magnetic theory there was used 
the model of an electro-magnetic ether. This involved the concepts of 
absolute space and time. It was found, however, that the space and time 
which belonged to this model was not that called for by the physical facts. 
It was the recognition of the difficulty (4) in the framework of classical 
space and time which led, of course, to the space-time of the special theory 
of relativity. 


2. Some Mathematical Difficulties in Relativistic Physics. 

With the discovery by Einstein of the special theory of relativity, the 
difficulty (4) disappeared of course, and a deeper understanding of the 
electromagnetic equations of Maxwell-Lorentz was obtained. However, formidable 
new difficulties arose in its place, of which I will only mention two. 

If we attempt to use the notion of a ‘point charge’, it immediately 
appears that the difficulty (5) of infinite available energy will arise. 

Furthermore, if we regard electricity as attached to spatially distributed 
matter, the enormous repulsive forces thereby introduced would lead to 
the difficulty (6) of instability. In connection with the difficulty of instability 
it is to be borne in mind that, with the space-time of relativity, it has not 
been possible to find a satisfactory counterpart of the rigid body or elastic 
solid, by the aid of which it was possible to secure stability for a charged 
body in the earlier theory. 

Finally, I will mention another difficulty, present when we use either 
the framework of space and time of the special theory of relativity or that 
of classical physics, namely the difficulty (7) of the non-existence of simple 
neutral states. In fact, according to these theories, positive and negative 
electricity can never be superposed and so electric fields must always exist. 
Nevertheless in Nature there is observable a general tendency towards 



neutralization. From the philosophical point of view it appears to me likely 
that such neutralization can occur. 

The difficulties (1) to (7) seem to me the essential difficulties of 
mathematical type which have arisen in the use of conceptual models in physics. 
I believe that physical theorists in the past have paid far too little attention 
to difficulties of this type. 


3. A General Criticism of Quantum Mechanical Theories. 

Current theories of quantum mechanics seem to me to be subject to 
the serious criticism that they involve improvisations and modifications, and 
thus lack in unitary character. It is as if an astronomer, knowing only the 
laws of Kepler governing the motion of two bodies, were to feel his way 
skilfully towards calculating orbits in the three body problem presented by 
the Sun, Earth, and Moon. He would undoubtedly succeed because the facts 
of observation would guide his formal conjectures; but his work would not 
be likely to lead him to the Newtonian law of gravitation since he would 
be wrong from the start. 


Part III. 

A Quantum Mechanical Model. 

1. The Perfect Fluid. 

We proceed now to develop a conceptual model which may possibly 
be of service in connection with quantum mechanics. In seeking such a 
model, it seems natural to take at first the space-time background provided 
by the special theory of relativity; this merely amounts to dealing at first 
with matter in small quantity. The phenomenon of gravitation may be 
incorporated basically later by methods of general relativity due to Einstein. 

The simplest form of matter available in such a background is that of 
‘inchoate matter’, characterized by a single scalar quantity $\rho$, the density. 
If we imagine electricity of density $\sigma$ in electro-static units attached to such 
matter, the usual pondero-motive laws together with the Maxwell-Lorentz 
laws of the electro-magnetic field provide us with a complete mathematical 
theory. Unfortunately such a distribution of charged inchoate matter will 
be highly unstable due to electric forces of repulsion. 

As a first generalization of such inchoate matter, it is natural to introduce 
a pressure $p$ which is functionally related to the density $\rho$ so that 
$p = f(\rho)$. In this case, however, the difficulty of excessive disturbance 
velocities arises if the disturbance velocity in such a fluid is less than that of 



light. In fact, by the very concept of such a fluid, its parts cannot freely 
interpenetrate. Hence, if two portions were to collide with a relative velocity 
exceeding twice the disturbance velocity, the difficulty (2) of excessive 
collision velocities would enter. It is immediately suggested, therefore, that 
we require the disturbance velocity to equal that of light. This leads us at 
once to take the following equation: 

(1) $p = \tfrac12(\rho - \rho_0)$ 

as the special form of the pressure-density relation in such a ‘perfect fluid’.¹ 
Here $\rho_0$ is the density in equilibrium under no pressure. 

A fundamental property of this perfect fluid, under the action of any 
forces whatsoever, is that the ratio $\rho/\sigma^2$ is constant along every world line 
of the fluid. This follows from the fact that the quantity of matter as 
measured by 

(2) $\displaystyle \int \sqrt{\rho}\,dv \qquad (dv = dx\,dy\,dz)$ 

in rest co-ordinates $x, y, z$, and the quantity of electricity as measured by 

(3) $\displaystyle \int \sigma\,dv,$ 

are both conserved. We shall write this ratio in the following form 


( 4 ) 



The scalar $\varphi$ will be called the ‘substance coefficient’ of the world line 
in question. 

Such a perfect fluid carrying electricity would not constitute a sufficiently 
flexible model for our purposes, for it too possesses a strong tendency 
towards instability. However, except for the difficulties ( 6 ) of instability 
and ( 7 ) of the non-existence of neutral states, the fluid would be satisfactory 
from a mathematical point of view. If, further, we provide for the free 
interpenetrability of positively and negatively charged matter (proton and 
electron), the second of these difficulties disappears. This we now do, so 
that the main task lying before us is that of securing the suitable stable 
character of positively and negatively charged portions of the perfect fluid 
when superposed. 

Inasmuch as such instability can be avoided in classical physics by the 
use of the rigid or elastic body as the carrier of electricity, it is immediately 


¹ We employ the light-second, gram, and second as units of length, mass, and time. 





suggested that some further extension of the concept of the perfect fluid 
must be made in virtue of which it somewhat resembles the rigid or elastic 
body in its behavior. 

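Let us write down the complete set of equations for the perfect
fluid as so far defined. These are the usual electro-magnetic equations in
tensor form,

(5)    $\dfrac{\partial F^{ij}}{\partial x^j} = -4\pi\sigma v^i \qquad \left(v^i = \dfrac{dx^i}{ds},\ F_{ij} = -F_{ji}\right),$

       $\dfrac{\partial F_{ij}}{\partial x^k} + \dfrac{\partial F_{jk}}{\partial x^i} + \dfrac{\partial F_{ki}}{\partial x^j} = 0,$

and the pondero-motive equations

(6)    $\dfrac{\partial T^{ij}}{\partial x^j} = \sigma F^{i}{}_{\alpha} v^{\alpha}, \qquad T^{ij} = \rho v^i v^j - \tfrac{1}{2}(\rho - \rho_0)\, g^{ij}.$

Here $T^{ij}$ is the usual energy tensor, and $g^{ij}$ denotes the fundamental
(contravariant) tensor.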


2. The Atomic Potential. 


In 1926 (loc. cit.) I generalized the perfect fluid by introducing certain
further body forces due to what I termed an ‘atomic potential’. The notion
of atomic potential was suggested to me as the most natural and simple
formal extension of the perfect fluid. The idea may be presented in the
following manner. The force vector $f_i$ due to the electro-magnetic forces
is given by the equation $f_i = F_{i\alpha} v^{\alpha}$. In virtue of the skew-symmetry of
$F_{ij}$, this force vector $f_i$ is automatically orthogonal to the velocity vector
$v^i$, as any force vector must necessarily be. Therefore, in seeking for available
body forces, we look for other forces, which in virtue of their formal
nature are orthogonal to the velocity vector. The simplest force of this
description seems to be that with components $f_i = \partial\psi/\partial x^i$, where $\psi$ is the
atomic potential in question, constant in value along each world line; for,
in consequence of this constancy we have necessarily the orthogonality
desired,

(7)    $\dfrac{\partial\psi}{\partial x^{\alpha}}\, v^{\alpha} = 0.$

Thus we adjoin to the electro-magnetic forces and the pressional forces
the force $\partial\psi/\partial x^i$ given by the atomic potential gradient.
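The two orthogonality statements above can be checked numerically. The following is a minimal sketch, assuming units with the velocity of light equal to 1; the tensor entries, the speed 3/5, and the helper names `force` and `contract` are illustrative choices and not part of the paper.

```python
from fractions import Fraction

def force(F, v):
    """Return f_i = F_{i a} v^a for a covariant 4x4 tensor F and a 4-velocity v."""
    return [sum(F[i][a] * v[a] for a in range(4)) for i in range(4)]

def contract(f, v):
    """Return the scalar f_i v^i."""
    return sum(f[i] * v[i] for i in range(4))

# Arbitrary skew-symmetric field tensor, F_{ij} = -F_{ji} (illustrative numbers).
entries = {(0, 1): 3, (0, 2): -2, (0, 3): 5, (1, 2): 7, (1, 3): -1, (2, 3): 4}
F = [[Fraction(0)] * 4 for _ in range(4)]
for (i, j), value in entries.items():
    F[i][j] = Fraction(value)
    F[j][i] = Fraction(-value)

# Unit 4-velocity for speed 3/5 along x (c = 1): v = (5/4, 3/4, 0, 0).
v = [Fraction(5, 4), Fraction(3, 4), Fraction(0), Fraction(0)]

# Electro-magnetic force and its contraction with the velocity.
f_em = force(F, v)
power_em = contract(f_em, v)

# A potential psi = g(x - 3t/5) is constant along world lines of speed 3/5.
# Taking g' = 2 at the point considered, its gradient (d/dt, d/dx, d/dy, d/dz) is:
g_prime = Fraction(2)
grad_psi = [-Fraction(3, 5) * g_prime, g_prime, Fraction(0), Fraction(0)]
power_psi = contract(grad_psi, v)

print(power_em, power_psi)
```

Exact rational arithmetic is used so that both contractions vanish identically rather than to rounding error.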

The energy tensor then generalizes to the form

(8)    $T^{ij} = \rho v^i v^j - \left[\tfrac{1}{2}(\rho - \rho_0) - \psi\right] g^{ij}.$

The equations (5) and (6) continue to hold as before.




3. Choice of Substance Coefficients and Atomic Potentials. 

If the scalar densities $\rho_\pm$ and $\sigma_\pm$,¹ the velocity tensors $v^i_\pm$, the substance
coefficients $\varphi_\pm$, the atomic potentials $\psi_\pm$, and the electro-magnetic
force tensor $F_{ij}$ are given at any instant of time, it is clear that the values
of these are determined for all time by the equations (5), (6), and (7),
written above.

Furthermore, if at any instant we have superposed fluids in the neutral
state with common velocities, so that $\sigma_+ + \sigma_- = 0$, $v^i_+ = v^i_-$, $F_{ij} = 0$
everywhere, and if the squared substance coefficients and atomic potentials in the
superposed fluids are in a fixed constant ratio, namely,

(9)    $\dfrac{\varphi_+^2}{\varphi_-^2} = \dfrac{\psi_+}{\psi_-} = \tan^2\theta \qquad (\theta, \text{ a constant}),$

then it is immediately seen that this neutral state will continue indefinitely.
In particular if the initial velocities are taken to vanish, we obtain a static
neutral state.

We assume that the atomic system formed by proton plus electron admits
such a static state, with constant electrical densities $k$ and $-k$, and possessing
spherical symmetry. These hypotheses in conjunction with (9) yield at once
the equations

(10)    $\varphi_+ = \varphi\sec\theta, \qquad \varphi_- = \varphi\csc\theta,$

with $\varphi$ an auxiliary function defined by


(11) 


+ 4- 

<p <r+ <r- 



while $\varphi_0 = f(r_0)$, where $r_0$ designates the atomic radius.

In order to complete the characterization of our fluid we suppose
further that the arbitrary function $f(r)$ of the radial distance $r$ is quadratic
in $1/r$:


(12) 




Here $\alpha$, $\beta$, and $\gamma$ are constants at our disposal.

Finally we have to fix the radius $r_0$ of the atom. Now $\beta\sqrt{\alpha}$ has the
dimensions of reciprocal distance according to our theory. We may write
therefore

(13)    $r_0 = \dfrac{\lambda}{\beta\sqrt{\alpha}}$

where $\lambda$ is a dimensionless constant. We shall take $\lambda$ to be large. The
quantity $1/\beta\sqrt{\alpha}$ is taken to be exactly the Bohr radius $a_0/c$ (in our units),
so that the outer ‘electrical radius’ $r_0$ of the atom is large in comparison
with the inner ‘mechanical radius’ $a_0/c$ (Bohr radius). We assume that the
interpenetrability of protons with other protons and of electrons with other
electrons ends at the mechanical boundary $r = a_0/c$.

¹ The subscript + or − is used according as reference is made to a positively or a
negatively charged portion of the perfect fluid.

This completes our list of specializing assumptions. It is to be observed
that the proposed theory involves five constants

(14)    $\alpha,\ \beta,\ \gamma,\ \theta,\ k.$

The number $\lambda$ is large but unspecified. These five constants are, however,
connected by one (arbitrary) relationship



The situation here is to be compared with that in usual physical theory
where we have the five constants

        $m,\ M,\ e,\ h,\ a_0$    (in C. G. S. units)

connected by one (arbitrary) relationship

(15)    $a_0 = \dfrac{(m + M)h^2}{4\pi^2\, m M e^2}.$

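As a numerical illustration of the relationship among the usual constants, the sketch below evaluates the familiar reduced-mass expression for the Bohr radius in C. G. S. units, then converts it to the light-second unit of length used in this paper. The numerical values are rounded modern ones and are assumptions; they do not appear in the paper, and the variable names are chosen here for illustration only.

```python
import math

# Rounded modern C.G.S. values (assumed; not taken from the paper).
m = 9.109e-28    # electron mass, g
M = 1.6726e-24   # proton mass, g
e = 4.803e-10    # elementary charge, e.s.u.
h = 6.626e-27    # Planck constant, erg s
c = 2.998e10     # velocity of light, cm/s

# Bohr radius with the reduced mass m*M/(m + M), in centimetres.
a0 = (m + M) * h**2 / (4 * math.pi**2 * m * M * e**2)

# The same length expressed in light-seconds (unit of length used in the paper).
a0_light_seconds = a0 / c

# Fine-structure constant and the much smaller length alpha^2 * a0, in centimetres.
alpha = 2 * math.pi * e**2 / (h * c)
inner_length = alpha**2 * a0

print(a0, a0_light_seconds, alpha, inner_length)
```

With these inputs the radius comes out near 5.3 x 10^-9 cm, and the length alpha^2 a0 is smaller by a factor of roughly 19,000.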


The remaining constants of our theory can be expressed in terms of the 
four independent constants m,M,e,h as follows: 


( 16 ) 


m c % 



P 2ne' /2w*V . 81 /2w«V 

vs"Tr* r -hr) + iiFhr)* 


tan 0 = 



It is very interesting to observe that the quadratic form $4\pi^2/\varphi^2$ is
positive definite and differs very slightly from a square, so that $2\pi/\varphi$ itself is
with high accuracy given by the linear form

        $\dfrac{mc^2}{h} - \dfrac{\alpha}{r}.$¹

Furthermore, nearly all of the mass of the atom is concentrated in a narrow
spherical ring at a radial distance of about $\alpha^2 a_0/c$; therefore, the
concentration of mass would occur near the surface of the nucleus.


¹ Here $\alpha$ denotes the fine-structure constant.




4. The Atomic Frequency Equations.

As I showed in my 1926 paper, it is easy to deduce the equations for
small vibrations of a superposed proton and electron of the type specified.
The three linear frequency equations which result turn out to be


(17) 




2^ £ /a_£ ay 

<p r U* + + 

2^ 2 /a_^ a_y az\ 

<p r\dx a y dz) 

Z+12L1IIX ay az\ 

<p r{dx dy dz) 


= 0 

= 0 

= 0 


where $\Delta$ denotes the ordinary three-dimensional Laplacian operator and
where $Xe^{ipt}$, $Ye^{ipt}$, $Ze^{ipt}$ denote the small components of the electrical
forces within the atom in the direction of the $x$, $y$, $z$ axes respectively, of
period $2\pi/p$. In addition to these linear differential equations, which hold
within the atom, there are three similar equations

(18)    $\Delta X + p^2 X = 0, \qquad \Delta Y + p^2 Y = 0, \qquad \Delta Z + p^2 Z = 0$

holding outside of the atom. Of course one has to impose the further
conditions that the electrical and magnetic forces are continuous across the
boundary of the atom.

Now closer consideration indicates that if the dimensionless constant $\lambda$
is large (that is, the electrical radius of our atom is large in comparison to
the Bohr radius), we may approximate to the above two linear systems by
taking the first set of equations to be valid throughout space. We then
seek those characteristic values $p$ for which there exist characteristic solutions
$X, Y, Z$ of the frequency equations (17) vanishing at infinity. This problem
is a homogeneous linear boundary value problem of classical, although not
self-adjoint, type.


5. The Corresponding Characteristic Values and Functions. 

We proceed to sketch some of the facts concerning the solution 
of this boundary value problem. Let us first obtain all solutions of the 
special form 

(19)    $X = yF(r, z), \qquad Y = -xF(r, z), \qquad Z = 0.$

It will be seen that for these the total electrical density vanishes everywhere
and our three equations are replaced by a single equation of the form

        $\left[\Delta + \left(p^2 - \dfrac{4\pi^2}{\varphi^2}\right)\right] X = 0 \qquad (X = yF(r, z)).$

In the explicit determination of $F$, the Laguerre polynomials enter
just as they do in the Schrödinger equations for the hydrogen atom, and
one finds that, for solutions vanishing at infinity, we have

0—i+ViT+F+a. 


where $k$ and $l$ are integers with $l \geq 1$.

If now we write $n = l + k + 1$, and solve for $p$ to terms of the second
order in $\gamma$ (essentially the square of the fine-structure constant $\alpha$) we obtain


[ 20 ) 


pm=± (a — 


/ . t*y\ 1 l 1 \ 

( 2»i ,T 2»'[/+i 


for $1 \leq l < n$. This would yield the correct fine-structure formula only if
$l + \tfrac{1}{2}$ be replaced by $l$.

So far we have referred to solutions of the special form (19),
which all satisfy the condition

(21)    $\dfrac{\partial X}{\partial x} + \dfrac{\partial Y}{\partial y} + \dfrac{\partial Z}{\partial z} = 0$

and are specially related to the $z$-axis. But the totality of all solutions of (21)
forms a linear family invariant under the group of rotations. An elementary
group theoretic discussion shows that each characteristic value $\pm p_{nl}$ is of
multiplicity precisely $2(2l + 1)$, inasmuch as the specialized boundary value
problem is self-adjoint and the invariant linear family consists of $2l + 1$
linearly independent functions.

It remains to consider the rest of the characteristic values of our
boundary value problem; namely, those for which (21) fails to hold. In doing
so it is found convenient to pass to the corresponding set of equations in

        $\dfrac{\partial X}{\partial x} + \dfrac{\partial Y}{\partial y} + \dfrac{\partial Z}{\partial z}, \qquad xX + yY + zZ.$

The new characteristic values $p^*_{nl}$, near to $p_{nl}$, are found to exist for
$l > 0$ and are of multiplicity $2(2l + 1)$. A single radially symmetric solution
exists for $n = 1$ with a simple characteristic value $p_{10}$. However the modified
fine-structure formula (20)* is not the correct one obtained from (20) by
replacing $l + \tfrac{1}{2}$ by $l$. Our conclusion is that a slightly different choice of $\varphi$
is required; if, for example, zve replace y by — 0 ]a in (12) the desired fine- 
structure formula is obtained, although this choice is not satisfactory for 
other reasons. 





6. Physical Interpretation. 

With the choice of the fundamental constants already specified in (16),
the frequency formula (20)* gives in the first approximation the usual
Balmer frequency formula, provided that we accept the Planck-Einstein law.
Furthermore, the multiplicity of each state is that required by the physical facts.

But the actual frequencies obtained from our equation are all close to
$mc^2/2\pi h$, and so correspond to energy changes of the order of a million
electron volts according to the usual theory. Let us suppose, as it seems
very reasonable to do, that for such high frequencies as this, energy is
radiated very slowly from our oscillating atom. Furthermore since the
characteristic functions belonging to the characteristic number $p_{nl}$ yield zero
electrical moment (w = 0) we may well suppose that the corresponding
radiation can be ignored.

Inasmuch as the asymptotic formulas for the characteristic functions
belonging to $p^*_{nl}$ are expanded in ascending power series in $n/\lambda$, it seems
likely that the corresponding radiation takes place more rapidly the larger
the value of $n$. For these the electrical moment is not zero.

Now the problem from which we started was not a linear problem, such
as is afforded by the usual Schrödinger equation. Hence we need to
consider the higher order corrections to the simple harmonic vibrations of
the first order $Ae^{\pm ipt}$. The second order corrections are of the form
$Be^{(\pm p_1 \pm p_2)it}$ and fall into two classes. The first class is that in which the
two signs ± agree; this possibility gives new high frequencies, which again
we take to radiate energy slowly. The second class is that with opposing
signs; namely, $Be^{\pm(p_1 - p_2)it}$.

Thus the desired difference frequencies, hitherto obtained by an application
of the Planck-Einstein ‘law’, enter naturally here and would seem to
be precisely the ones at which the energy is radiated.
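Why a quadratic correction produces sum and difference frequencies can be seen from an elementary identity. The sketch below assumes a first-order vibration made of two real harmonics with amplitudes $A_1, A_2$ and frequencies $p_1, p_2$ (symbols chosen here for illustration), and squares it as the simplest stand-in for a second-order term; the actual nonlinear terms of the fluid equations are not reproduced.

```latex
\begin{align*}
u(t) &= A_1\cos p_1 t + A_2\cos p_2 t, \\
u(t)^2 &= \tfrac{1}{2}\left(A_1^2 + A_2^2\right)
        + \tfrac{1}{2}A_1^2\cos 2p_1 t + \tfrac{1}{2}A_2^2\cos 2p_2 t \\
       &\quad + A_1 A_2\cos\,(p_1 + p_2)t + A_1 A_2\cos\,(p_1 - p_2)t .
\end{align*}
```

The terms in $2p_1$, $2p_2$, and $p_1 + p_2$ belong to the first class (agreeing signs); the term in $p_1 - p_2$ is the difference frequency of the second class.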

Evidently the picture presented in this way is highly suggestive and 
deserves careful further study, although the study is certain to be one of 
considerable mathematical difficulty. 

7. On Atomic Stability. 

The atom thus defined is endowed with a possibility of stability inasmuch
as the perfect fluid used is not physically isotropic. However, in the
theory as formulated so far there would only be a semi-stability in the
static neutral state, since the superposed fluids would be completely plastic
although nearly invariant in volume.¹

¹ For a critical discussion of stability for the first form of my theory (see p. 207) cf.
T. L. Smita (Harvard Dissertation, 1931), The Birkhoff Fluid Theory of Electricity.



I propose the following as a simple postulate designed to ensure such 
stability to the necessary extent: the mechanical and electrical surfaces of 
the proton and electron always remain convex in virtue of surface forces 
of constraint which are called into play when the curvature is about to 
change sign. 

In consequence of this postulate a convex proton in collision with 
another such proton would necessarily have a plane surface of contact 
with it. Likewise atoms under pressure in the neutral static state would 
take the form of polyhedra, and in all probability a crystalline form would 
result that is geometrically related to that of the elementary polyhedron. 

Under such a law of convex form one would expect the proton and 
electron to remain approximately spherical under the action of random 
forces of collision or electro-magnetic forces of the magnitude to be expected 
in the gaseous condition. Furthermore, because of the laws of conservation 
of energy and momentum, and the small compressibility of the atom, it 
would appear that the ordinary kinetic theory of gases would be applicable. 

8. On the Family of Elements.

From the point of view taken above it is natural to define the neutron
as an uncharged portion ($\sigma = 0$) of the perfect fluid, with the same atomic
potential $\psi_+$ in the spherically symmetric state as the proton. Similarly the
positron would be defined just as the electron, except that the charge density
is taken opposite in sign. As for the photon, it might be defined as
a point particle carrying an arbitrary ‘energy’ and moving with the velocity
of light, which obeys the ordinary relativistic laws for conservation of
energy and momentum when it collides with a proton, electron, neutron,
or positron.

We may revise also the usual requirement that separation of colliding
fluids takes place when the pressure vanishes ($p = 0$). Instead let us suppose
that separation takes place when the pressure takes on a certain critical
negative value $p = p_0 < 0$, characteristic of colliding bodies (as proton and
neutron). Evidently if we suppose that a small critical tension exists between
proton and proton and a large critical tension between proton and neutron,
we may expect isolated neutrons to be rare, and something like a family
of chemical elements to arise. Furthermore, inasmuch as rest mass is not
invariant for our fluid, we should find small deviations in the atomic masses
from integral multiples of the mass of the hydrogen atom.

Such are some of the features of a possible atomic model. In laying
it before you I do so largely because it seems to me that the possibility
of a conceptual treatment of matter and electricity on a relativistic basis
has not as yet been carefully explored. The model of which use is made
here involves only a natural extension of the simplest type of relativistic
perfect fluid as the carrier of electricity. Yet it seems to satisfy many
of the mathematical and qualitative requirements which are desirable.
Whatever be the fate of this particular model, I hope that it will provoke a
more thorough-going study of the conceptual possibilities.





Reprinted from Journal of the Franklin Institute, Vol. 226, Sept.,
1938, pp. 315-325.


ELECTRICITY AS A FLUID.* 

BY 

GEORGE DAVID BIRKHOFF, Ph.D., Sc.D., 

Harvard University, Cambridge, Mass.


As all the world knows, Benjamin Franklin was one of the 
great early figures in electrical research and, in particular, 
established the electrical nature of lightning by his celebrated 
kite experiments. In the course of these investigations he was 
led to conjecture that electricity is a kind of fluid. More 
precisely he thought of a positively charged body as carrying 
more of this electrical fluid than a negatively charged body. 
Thus the neutral condition arises when electricity is equally 
distributed in material bodies. Such a one-fluid theory has 
an obvious disadvantage: since electricity of one sign repels 
itself, there should be a force of repulsion between the parts of 
any body carrying a charge, and no such repulsion is found in 
the neutral state. 

Franklin’s investigations of the nature of electricity formed,
of course, one of the instances of his remarkable versatility; 
and it is interesting for a mathematician like myself to recall 
his ingenuity in devising magic squares, showing that he was 
not only curious about the laws of Nature, but also about the 
fascinating properties of numbers. 

The theory of electricity which I propose to describe today 
is a two-fluid theory and yet resembles in a certain sense the 
one-fluid theory of Franklin more than it does the final form of 
the two-fluid theory which held sway until the beginning of the 
present century. In fact according to the theory offered here 
positive and negative electricity may freely interpenetrate to 
produce a true neutral condition. According to the usual 
two-fluid theory, however, elementary positive and negative 
electrical particles cannot interpenetrate, so that a true 
neutral condition is impossible. 

The subject of my talk will appear out of date to most
physicists. For this reason I must begin by indicating a few
of the considerations which have interested me in it. Roughly
speaking, until about the year 1900 a background of absolute
time and of space filled with an electro-magnetic ether,
together with continuously distributed electrified matter,
formed the basis of attempts to explain physical phenomena.
In this effort free use was made of such conceptual models as
rigid, fluid, and elastic bodies.

* Presented Friday morning, May 20, 1938.

It was realized by the end of the last century, however, 
that there was something essentially inappropriate in the use 
of such models; and the new era of special relativity opened 
by the work of Lorentz and Einstein showed why this was the 
case. But there were certain attendant disadvantages. In 
particular the concept of a rigid atom carrying an electric 
charge and yet obviously stable, had to be abandoned since no 
relativistic substitute for the rigid body was found. Thus 
through relativistic advances there arose a general scepticism 
as to the validity of previous concepts concerning matter, and 
physicists were led to conjecture that all mass was electro-magnetic.

Meanwhile the famous quantum hypothesis had been 
enunciated by Planck in 1900. According to him, energy was 
radiated from the atom only in discrete quantities or quanta, 
and not continuously as had been thought. This hypothesis 
led to the formulation of the so-called Einstein radiation law 
by Einstein in 1905 and to the Bohr theory of the atom in 
1913. In this last theory the protons and electrons were 
thought of as point masses carrying electrical charges + e and 
— e, the light electron rotating around the heavy proton in an 
elliptical orbit due to the forces of electrical attraction. 
Bohr’s startling ad hoc hypothesis that the electrons can jump 
discontinuously from one stationary state to another, contributed
to make the notions of space, time, and matter more
insubstantial than ever. 

Then came the present theory of quantum mechanics
beginning with Heisenberg’s and Schrödinger’s celebrated
work of 1926. Although Schrödinger’s theory takes its start
in the usual model of the atomic system, this is finally discarded
as a kind of preliminary scaffolding. It is only the final
Schrödinger equation, determining the energy levels and so the
atomic frequencies, which is regarded as fundamental.




What has been the consequence of this theory? Simply 
that somehow or other the mathematical physicist has 
constructed a remarkable bridge which enables him to pass 
from a classical picture of the atomic system to highly accurate 
formulae for the positions and intensities of the spectral lines of 
the chemical elements. Thus it has become possible at last to 
describe spectroscopic phenomena and to branch out into 
related fields. Inasmuch as physicists have long regarded the 
spectra of the various elements as affording the true key to 
atomic structure, it is easy to understand why the opinion 
gained currency that the ultimate secret of atomic structure 
was about to be revealed. 

A decade has gone by, however, and much of this first
optimism has disappeared. Many physicists would concede
today that theoretical physics is in need of basic revision.
One reason is that many of the hypotheses which have been
employed in the development of quantum mechanics seem
ad hoc. In my opinion, the present state of quantum mechanics
is somewhat analogous to that of theoretical astronomy
after Kepler had successfully described the motion of
two bodies. It would have been possible then to arrive at the
details of lunar and planetary motions as arising from a slight
disturbance of Keplerian motions. These disturbances obviously
involve the fundamental periods of the constituent two-body
problems, such as the solar year, the lunar month, etc.
Then by use of suitable “selection rules” and similar devices,
I feel confident one could arrive at a fairly satisfactory
descriptive theory. Very fortunately, however, Newton
formulated his law of universal gravitation and so rendered
tentative mathematical work of this sort unnecessary. Perhaps
some similar new conceptual outlook will shortly replace
the present abstract and artificial point of view of quantum
mechanics.

May it not be that the full exploitation of relativistic
electro-dynamics will reveal possibilities which have been
overlooked? This is a field which has never been investigated
adequately, because the successes of Planck and Bohr convinced
the physicists that the universe about us is as discontinuous
in its essence as the set of positive integers. In other
words there has ensued a state of mind prejudiced against the
use of continuous space and time as fundamental instruments
of thought. One can scarcely help recalling similar decisions
of the past, first in favor of Newton’s corpuscular theory of
light, then in favor of the alternative wave theory, both of
these being now replaced by an uncertain combination of the
two theories which is adopted with almost no dissenting voice.

Whether or not relativistic electro-dynamics will ultimately
play a fundamental role is highly conjectural. But I
feel confident that a systematic study of this neglected field 
needs to be made, if it serves only to establish the true 
limitations of the conceptual models which it furnishes us. If 
the attempt to use such means fails, I know of no other 
possibility for a conceptual understanding of the physical 
universe. 

Immediately after the first formulations of the new 
quantum mechanics I presented in 1926 a first attempt of this 
kind before the American Association for the Advancement 
of Science at its Pittsburgh meeting. What I wish to talk 
about today is a somewhat improved theory which was given 
by me before the International Congress for Mathematicians 
at Oslo in 1936, although I have not been able as yet to work 
out all the details in satisfactory form. 

In this theory my first remark is one which concerns the 
role of gravitation, namely that gravitational effects may be 
omitted entirely from consideration. In fact if but little 
matter is present in space-time, gravitational forces become 
practically negligible and the relativistic background becomes 
that of the flat space-time used in the special theory of 
relativity. Now if one writes down the laws of motion for 
such a universe, inclusive of the electro-dynamical equations, 
and interprets the rectangular coordinates used as “geodesic 
coordinates” in the generalized theory of relativity, one 
arrives at modified laws in which gravitation is basically 
incorporated. In other words we can omit or include gravitational
effects with the same facility as in the Newtonian
theory of gravitation where gravitational forces are merely
superadded. We are, therefore, justified in confining attention
to the space-time of the special theory of relativity.

Let us now suppose a continuous distribution of inchoate
matter having a density $\rho$. Such a distribution would be
approximately realized by a cloud of cold nebular dust remote
from all other matter. For such an over-simplified universe a
complete mathematical theory is possible; and to a first
approximation the final result may be stated in the form that
each particle travels with uniform velocity in a straight line
except as deflected by the usual gravitational forces of
attraction exerted by the other particles.

An interesting extension of this model is obtained if we
imagine positive or negative electrical fluid of density $\sigma$ to be
attached to such inchoate matter. But I do not know of any
physical situation in which a universe of this kind might be approximately
realized in ordinary experience. Adjacent portions
of such charged matter would be mutually repelled by
enormous forces of electrical origin and so would tend to
recede from one another at high velocities. On the other
hand, there would be a strong tendency for oppositely charged
portions to approach one another due to attractive forces and
to become neutralized by super-position. Thus there would
soon be attained a nearly neutral state much like that of
uncharged inchoate matter. Here too the mathematical
theory would be complete, although the classical electro-dynamic
equations of Maxwell and Lorentz now come into play.

This seems to be about as far as we can go without introducing
“non-electrical forces.” Three or four years ago I
should have felt somewhat embarrassed to introduce such 
forces, but today physicists are once more admitting that such 
forces need to be used in order to account for the observed 
facts. At a meeting of the American Philosophical Society 
recently held in this city, Dr. Karl Darrow has alluded to this 
change in attitude. A propos of this, I may remark that it 
has always seemed to me impossible to develop logically the 
fundamental equations of the electro-magnetic field except by 
taking in from the outset ideas of pondero-motive force, of 
space-time, of mass, etc. 

Let us, therefore, search for a generalization of the notion 
of the rigid body, the elastic body or the fluid which is appropriate
to the special theory of relativity and which is as
simple as possible. Up to the present time no one has found a 
satisfactory generalization of the rigid or elastic body. The
reason for this failure is obvious. In both cases we have a
natural shape of the body when its parts are relatively at rest.
The possibility of such a natural shape arises through the fact
that tensional forces may be introduced which change as the
Euclidean distance of nearby points changes. But in special
relativity, distance in the ordinary sense has lost its meaning;
for example, the space-time distance between a point emitting
light and a point on the out-going light wave a thousand years
later is still zero. Our only chance, therefore, turns out to be a
generalization of the adiabatic fluid, which can be easily
accomplished as follows. Let us imagine again a continuous
distribution of matter of density $\rho$; but let us suppose that
such matter is subjected to a certain body pressure $p$ which is
functionally related to the density, $p = f(\rho)$. In such a fluid
the parts would tend to move as in the case of inchoate matter
except insofar as the pressure gradient yields additional body
forces. Such a relativistic fluid is used by Lemaitre in his
theory of the expanding universe. Mathematically speaking,
this is a very natural generalization, in that the energy tensor
$T^{ij} = \rho v^i v^j$ of inchoate matter is replaced by

        $T^{ij} = \rho v^i v^j - p\,g^{ij};$

here $v^i$ is the velocity tensor of matter and $g^{ij}$ is the fundamental
gravitational tensor in contravariant form. We are
assuming for the moment that the fluid carries no electric
charge.
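The remark that the space-time distance along a light ray stays zero can be illustrated with a short computation. This is a sketch only, assuming the units used in the address (seconds and light-seconds, so that the velocity of light is 1); the function name `interval_squared` and the choice of a Julian year are illustrative assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # Julian year (an assumed convention)

def interval_squared(dt, dx, dy, dz):
    """Squared space-time interval, with time in seconds and length in light-seconds."""
    return dt**2 - (dx**2 + dy**2 + dz**2)

# Elapsed time of a thousand years, in seconds.
t = 1000 * SECONDS_PER_YEAR

# Light travels one light-second per second, so the wave front is at distance r = t.
r = t
s2_light = interval_squared(t, r, 0.0, 0.0)

# For comparison: a body that stays at the emitting point for the same time.
s2_rest = interval_squared(t, 0.0, 0.0, 0.0)

print(s2_light, s2_rest)
```

The interval to the wave front vanishes however large the elapsed time, whereas the ordinary Euclidean distance `r` grows without bound; this is why Euclidean distance cannot serve to define a natural shape.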

And now I must mention a routine of technique which is 
fundamental for me in such mathematical model-making: I try 
systematically to avoid various theoretic difficulties which 
have arisen in the past. A particular difficulty of this kind for 
an adiabatic fluid is the following. Such a fluid has a natural 
disturbance velocity analogous to the velocity of sound in a 
gas. Now if one were to constrain a rigid body immersed in a 
gas to move suddenly in any direction at a velocity greater 
than that of sound the ordinary mathematical theory would 
break down. A similar state of affairs holds in our relativistic 
fluid. More precisely, if two parts of the fluid collide with a 
relative velocity more than twice the disturbance velocity, the 
theory fails. Consequently we get into difficulty unless the 
disturbance velocity of the fluid is at least as great as that of
the maximum velocity of the body as a whole, namely the
limiting velocity of light. On the other hand, it would seem
absurd to suppose that the disturbance velocity can exceed
that of light. Thus the demands imposed by the mathematical
point of view seem to require that the disturbance
velocity is always that of light. This leads to the conclusion
that the fluid in question must be what I have called the
“perfect fluid,” with constitutive equation $p = \tfrac{1}{2}(\rho - \rho_0)$
where $\rho_0$ is the density under no pressure. Here the units of
length and time are taken as the light-second and the second
respectively.
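That this constitutive equation makes the disturbance velocity equal to that of light can be checked in a few lines. The sketch assumes the modern convention in which the energy tensor is written $(\varepsilon + P)v^i v^j - P g^{ij}$, with proper energy density $\varepsilon$ and pressure $P$, together with the standard formula $c_s^2 = dP/d\varepsilon$ for the velocity of small disturbances; the symbols $\varepsilon$, $P$, and $c_s$ are introduced here for illustration and do not occur in the address.

```latex
\begin{align*}
\rho &= \varepsilon + P, \qquad p = P
   && \text{(compare } \rho v^i v^j - p\,g^{ij} \text{ with } (\varepsilon+P)v^iv^j - P\,g^{ij}) \\
P &= \tfrac{1}{2}\left(\varepsilon + P - \rho_0\right)
   \;\Longrightarrow\; P = \varepsilon - \rho_0 \\
c_s^2 &= \frac{dP}{d\varepsilon} = 1
   && \text{(units in which the velocity of light is 1)}
\end{align*}
```

On this reading the factor $\tfrac{1}{2}$ is exactly what is needed: any other constant slope in $p = f(\rho)$ would give a disturbance velocity different from that of light.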

A neutral portion of such a perfect fluid tends to expand
when its density is greater than the natural density
$\rho_0$, and it tends to contract in the contrary case. As time
elapses the fluid would tend to assume a highly irregular 
fluctuating shape, wisp-like but with finite total volume. If 
considerable masses of the fluid were contained in relatively 
small volumes, gravitational forces would enter and the bodies 
might take an approximately spherical form and move in 
various orbits through space. It is clear that this affords an 
interesting model from the astronomical point of view. 

From now on we shall use such a perfect fluid as a carrier of
electricity and will suppose that the electric fluid of density $\sigma$
is attached to this form of matter. If $\sigma$ is positive, such
matter is positively charged; if $\sigma$ is negative, it is negatively
charged; and if $\sigma$ vanishes the matter is neutral. We envisage
first the case in which the electrical charge is everywhere of one
sign. According to the electro-magnetic equations the total
quantity of electricity can never change; from which it results
that the ratio of $\rho$ to $\sigma^2$ is a constant along the world-line of any
particle of our fluid; this corresponds to the fact that the
natural mass of any part of the perfect fluid will not be
strictly invariable. This ratio $\varphi$ is called the substance
coefficient.

In such a universe matter would tend to expand indefinitely 
due to enormous electrical forces of repulsion, but nevertheless 
the total volume would remain finite. There would take 
place a rapid dispersal of the electrified matter throughout 
space, so that the parts recede from one another in wisp-like 
forms. If distinct portions of the fluid happen to collide at
any velocity (less than that of light) no mathematical difficulty 
arises due to the extraordinarily resilient character of the fluid 
alluded to above. However, in order to make the theory 
absolutely complete, definite laws would have to be formulated 
as to what happens after collision. For instance the bodies 
might separate again freely when the tension reduces to zero; 
or might stick together permanently as a unitary fluid; or 
might act in an intermediate manner. For the moment we 
omit formulation of such laws of contact. 

When we suppose that positively and negatively charged 
matter, and even uncharged matter, are present, the situation 
is somewhat more complicated. If we adhere to Franklin’s 
notion that electricity can be uniformly distributed in space, 
we must admit the free interpenetration of positively and 
negatively charged matter. For the moment we do not 
specify any other rules governing the behavior of bodies which 
come into contact. In such a universe it is clear that positively 
and negatively charged portions of matter would tend to 
become superimposed so as to neutralize electric charges. 
Furthermore, until such neutralization is effected electromagnetic 
radiation of energy into space will continue. However, 
the resulting final neutral condition would seem to be 
entirely amorphous. Our next step, therefore, is directed 
towards securing something like a stable atom. 

In order to achieve this goal, I again generalize the perfect 
fluid by assuming that there exist certain further body forces, 
attributable to what I have called an “atomic potential.” 
These are forces which constantly act in the direction of the 
gradient of being proportional to this gradient. The 
atomic potential itself is given once for all at each point of the 
body. Mathematically speaking, this generalization amounts 
to replacing the energy tensor written above by 

T iJ = pvV — (i(p “ Po) — 

We have now specified the complete list of constitutive 
equations for the electrified fluid in space-time. 

It remains to determine the arbitrary functions at our 
disposal. These are the substance coefficients and the 
electrical densities which are regarded as given primordially. 
Following what is usually attempted in physics, it seems best 
to begin by specifying these functions for the hydrogen atom 
in static neutral condition. When this state of equilibrium is 
disturbed it is then to be expected that some kind of radiation 
of energy will take place and our guide in determining the 
nature of the functions at our disposal is that the radiation 
frequencies should be those called for by the known spectroscopic 
facts. Our first supposition will be that in a static 
neutral state we have spherical symmetry with the same 
constant electric density ± k throughout the proton and 
superposed electron. All that remains to be specified is a 
single arbitrary function of the radial distance. Our special 
hypothesis is then that the squared substance coefficients and 
atomic potentials (which are all in fixed constant ratios to one 
another) are given by the reciprocal of a quadratic expression 
in 1/r where r denotes radial distance. This turns out to yield 
a theory with the same arbitrary physical constants as are 
required by current theories, namely the masses of proton and 
electron, the electronic charge and Planck's constant h. 
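
Written out, the special hypothesis has the following general shape. This is only a sketch of the form just described: the symbol F and the coefficients A, B, C are placeholders introduced here for illustration, and their values are not given in the present passage.

```latex
% F(r) stands for any one of the squared substance coefficients or
% atomic potentials, which differ only by fixed constant ratios.
% A, B, C are placeholder constants (assumed notation).
\[
  F(r) \;\propto\; \frac{1}{A + \dfrac{B}{r} + \dfrac{C}{r^{2}}}
\]
```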

Of course the vibrational frequencies of such a hypothetical 
hydrogen atom can be computed by well-known methods. In 
order to obtain the desired frequencies of emission, an 
artificial use had to be made of the Planck radiation law. 
When I proposed my theory of 1926, I saw that this was the 
case, but gave out the theory notwithstanding. 

It appears to me an extraordinary fact that closer examination 
has shown other lower, secondary frequencies to be 
present of precisely the type demanded by the Planck radiation 
law, although it is not yet clear that these will be 
dominant for relatively large disturbances of the atom. Such 
frequencies arise naturally here, precisely because we have to 
do with non-linear wave-equations. On the other hand, such 
difference frequencies cannot arise naturally if we employ the 
fundamental Schrödinger equation, inasmuch as it is strictly 
linear. Furthermore the correct fine-structure formula can 
also be obtained, although certain difficulties of interpretation 
have yet to be overcome. What needs especially to be 
investigated is whether or not the energy of such secondary 
radiation obeys the Planck radiation law. The mathematical 
situation involved is quite complicated, but I hope to complete 
the necessary calculations at an early date. 
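
The way in which a non-linear term generates difference frequencies can be seen from an elementary identity. As a minimal sketch (the quadratic term below is an illustrative assumption, not the actual non-linearity of the fluid equations), suppose that a disturbance containing two frequencies ω₁ and ω₂ enters the equations through its square:

```latex
\[
  (\cos\omega_1 t + \cos\omega_2 t)^2
    = 1 + \tfrac12\cos 2\omega_1 t + \tfrac12\cos 2\omega_2 t
      + \cos(\omega_1-\omega_2)t + \cos(\omega_1+\omega_2)t .
\]
```

The term in ω₁ − ω₂ is a difference frequency. A strictly linear equation, by contrast, carries each frequency through unchanged, so that no such term can appear.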


There are two serious objections to this model of the 
hydrogen atom. T. L. Smith 1 showed in 1928 that such an 
atom would be essentially shapeless except for the feeble 
influence of gravitation tending towards the spherical form; 
furthermore he pointed out that the small oscillations could 
not all be stable. In consequence I have added a further rule 
requiring convexity at the outer surfaces of the proton and 
electron, as well as at the “mechanical surface” of the proton 
where interpenetrability with other protons is supposed to 
cease; the mechanical surface is taken to be at the Bohr radius. 
This means that when the surfaces in question are about to 
lose their convexity, body forces normal to the surface come 
into play which prevent any concavity. 

Because of this additional law of convexity, it is apparent 
that the proton, whose volume always remains limited, can 
only extend itself in space by assuming a needle-like or disc-like 
form. It is obvious that neither form can be maintained 
under a succession of accidental collisions so that a return to 
approximately spherical shape may be expected. It would, 
therefore, seem that such a law of convexity will go a long way 
toward securing the desired atomic stability. 

Similar provisions can readily be made to obtain other 
elementary particles such as positrons, neutrons and photons, 
and to insure their agglomeration into more complicated types 
of atoms. To this end, it may, for example, be supposed that 
a proton and neutron in contact will not separate until a 
certain critical tension ρ₀/2 is reached. Inasmuch as this part 
of the theory is as yet highly problematical, I shall not allude 
to it further, except to stress the fact that many possibilities 
are at our disposal. 

My chief aim then today has been to urge the importance 
of further thorough-going consideration of relativistic electrodynamics. 
The special study which I have described above 
was built in consonance with certain purely mathematical 
necessities on the one hand, and qualitative physical considerations 
on the other. Although I propose to work out the 
consequences of my theory as rapidly as possible, I regard it 
merely as paving the way for other similar and probably more 
successful mathematical experiments. Unfortunately, such 
work is not likely to be particularly appealing to the theoretical 
physicists of the present day who would abrogate 
ordinary physical intuition. This seems to me to be 
unfortunate. 

1 See his Doctoral Thesis, “The Birkhoff Fluid Theory of Electricity,” 
Harvard University, 1928. 

In fact if ordinary physical intuition is to be permanently 
relegated to an inferior position, we must think of the objective 
universe as presenting to our senses an entirely false conceptual 
facade beyond which is vaguely to be discerned the one true 
sanctum sanctorum of quantum mechanics. Let us hope 
that this is not really the case and that our everyday common 
sense ideas may continue to hold a fundamental place 
among the things of the mind, having long been one of our 
most cherished possessions. I am sure that the great and 
realistically minded Benjamin Franklin, whom we unite to 
honor here, would have wished to retain permanently such 
ideas in their most valid form and to utilize them to the utmost. 





SCIENCE 


Vol. 97 


Friday, January 22, 1943 


No. 2508 



SCIENCE: A Weekly Journal devoted to the Advancement 
of Science, edited by J. McKeen Cattell and published 
every Friday by 

THE SCIENCE PRESS 

Lancaster, Pennsylvania 

Annual Subscription, $6.00 Single Copies, 15 Cts. 

SCIENCE is the official organ of the American Association 
for the Advancement of Science. Information regarding 
membership in the Association may be secured from 
the office of the permanent secretary in the Smithsonian 
Institution Building, Washington, D. C. 


SIR JOSEPH LARMOR AND MODERN 
MATHEMATICAL PHYSICS 

By Professor GEORGE D. BIRKHOFF 

HARVARD UNIVERSITY 


SIR JOSEPH LARMOR, MATHEMATICAL 
PHYSICIST 

On May 19th last the scientific world lost a notable 
mathematical physicist, Sir Joseph Larmor, Lucasian 
professor of mathematics at Cambridge, England, 
from 1903 to 1932, successor to Sir George Stokes in 
this celebrated chair once held by Sir Isaac Newton. 
After being graduated from Queen's College, Belfast, 
Larmor took highest honors in the Cambridge Mathematical 
Tripos of 1880 at about 23 years of age, J. J. 
Thomson being second wrangler in the same year. 
Larmor was called at once as professor of natural 
philosophy to Queen's College, Galway, where he remained 
until 1895. He then returned to St. John's 
College, Cambridge, as lecturer, and was named for 
the Lucasian professorship in 1903. From 1901 to 
1912 he was secretary of the Royal Society, and was 
awarded the Copley Medal of the society in 1921. 
Always deeply attached to his native country, Ireland, 
he entered Parliament in 1911 as Unionist representative 
of Cambridge University and served there for 
eleven years. He received various distinctions besides 
those mentioned. 

Larmor grew to scientific maturity at a time when 
every attempt was being made to explain all physical 
phenomena on a dynamical or at least a quasi-dynamical 
basis, involving the concepts of absolute space 
(the ether), of absolute time and simultaneity, of 
mass and force, characteristic of the Newtonian era 
of physical speculation. Larmor was already forty-three 
years of age when Planck in 1900 propounded 
his revolutionary idea of quanta of energy, destined 
to modify profoundly the current of physical ideas. 
In 1905 Einstein formulated his justly celebrated special 
theory of relativity, which was equally subversive 
of accepted classical ideas. 

Up to that time Larmor and his great compeer, the 
Dutch mathematical physicist, H. A. Lorentz, were 
both well in the van of the group attempting to reconcile 
current dynamical and electrodynamical theories, 
and in particular to explain the null effects of the 
famous Michelson-Morley experiments of 1887 and 
other attempts to ascertain the absolute motion of 
the earth in space. It had occurred somewhat earlier 
(around 1892) to another noted Irish mathematical 
physicist, Fitzgerald, as well as to Lorentz, that there 
might occur a minute contraction of length in the 
direction of motion, often called the “Fitzgerald-Lorentz 
contraction,” which would explain such null 
results. But it was Larmor who, working independently 
in the same circle of ideas, began to undertake 
a thoroughgoing mathematical study of the 
whole situation involved. 

The outcome was his great book “Aether and 
Matter” of 1900, which was perhaps his crowning 
achievement. As he states in the Introduction, “a 
complete formal correlation is established between the 
molecular configurations of a material system at rest 
and the same system in uniform translatory motion, 
which holds good as far as the square of the ratio of 
the velocity of the system to the velocity of radiation. 
This correspondence carries with it as a consequence 
the null result, up to the second order, of the very 
refined experiments of Michelson and Morley. . . .” 
Beyond the second order of this velocity ratio, Larmor 
was not much interested, since such small effects 
would be well outside of the range of experimental 
determination. Indeed, the whole spirit of the work, 
in a mathematical sense, is extremely close to that of 
Einstein’s later special theory of relativity of 1905, 
as the above quotation sufficiently shows, even if the 
concept of the ether as a particular frame of reference 
is employed. 
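
The meaning of “up to the second order” can be made concrete in modern notation, which is not Larmor's own. As a sketch, for a system moving with velocity v along the x-axis, with c the velocity of radiation, the transformation and the expansion of its characteristic factor read:

```latex
\[
  x' = \gamma\,(x - vt), \qquad
  t' = \gamma\left(t - \frac{vx}{c^{2}}\right), \qquad
  \gamma = \left(1 - \frac{v^{2}}{c^{2}}\right)^{-1/2}
         = 1 + \frac{1}{2}\,\frac{v^{2}}{c^{2}}
           + O\!\left(\frac{v^{4}}{c^{4}}\right).
\]
```

For the orbital motion of the earth v/c is about 10⁻⁴, so that the second-order term is of order 10⁻⁸, while the neglected terms are of order 10⁻¹⁶.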

Larmor goes so far in this book as to speculate 
concerning the explanation of gravitation from the 
same point of view. In a section with the title “Are 
the linear equations of the Aether exact?” (p. 186), 
he asks boldly: “Why then should not relatively minor 
phenomena like gravitation be involved in similar non-linear 
terms . . . in the analytical specification of the 
free aether . . .” At the end he concludes somewhat 
in favor of the “natural prepossession” that the 
equations are truly linear, and so against his own 
extremely interesting suggestion. 


The now familiar complete (“relativistic”) form of 
the ponderomotive forces of the electromagnetic field 
and the corresponding Lorentz (or Larmor-Lorentz!) 
transformations are used in this notable work. This 
was essentially completed in 1898 (see the Preface). 

It was not until the end of 1903 that Lorentz’s important 
article on electromagnetic theory in the German 
Mathematical Encyclopedia appeared. In his 
later article on “Relativity” in the Encyclopedia 
(1920), Pauli has stated the case thus (section 1, my 
translation): 

Now it became important to work this “Lorentz contraction” 
organically into the theory, and also to clear 
up the other negative attempts to find an influence of the 
earth's motion upon phenomena. Here Larmor is to be 
mentioned first, who already in 1900 set up the formulas 
now generally known as the Lorentz transformation, and 
thus also had in view the variation of time measurement. 
Lorentz's comprehensive article, which was finished at the 
end of 1903, added some brief comments, which were to 
show themselves later very fruitful. . . . 

Larmor himself has said a generous final word of 
appraisal on this question of priority (Appendix 
II, (1927) “On Relativity and Convection,” Vol. 1, 
Mathematical and Physical Papers (1929)): 

This transformation was developed for the complete 
scheme of the electric equations of the field . . . in Aether 
and Matter (1900), Chap. XI, but only up to the second 
order of v/c; being so restricted on the tacit ground that 
the finite size of the electrons . . . must in any case introduce 
uncertainty beyond the order of 10 »* (cm.). . . . 
This complete scheme for the electromagnetic field outside 
the atomic sources was obtained in exact form independently 
by Lorentz (1904) . . . and the correspondence 
leading to it is appropriately called by his name as having 
been the initiator . . . in 1892. 

Under the impact of the radical new theories, 
Larmor refused to abandon without further consideration 
the classical point of view which had served so 
well. He had realized very clearly how dynamics and 
electrodynamics were united deeply by means of the 
Principle of Least Action of Lagrange and Hamilton 
to which he always attached the greatest importance. 
However, there was a definite difference between his 
apologetic attitude for his lack of active mastery of 
quantum mechanics and its spectroscopic applications, 
and his repugnance, not so much towards relativity as 
for the setting in which it had been presented. Regarding 
the first, he says in the Preface to the second 
volume of his Papers: “the modern constructs in the 
problems of quantified spectroscopy . . . highly successful 
in their special fields . . . have hardly been 
entered upon (by himself), because the vast and tentative 
literature could not be justly appreciated except 
by a critic closely cognizant of the diverse evolutions 
of the last fifteen years in this field of knowledge.” 
But, regarding the second, he speaks of special 
relativity as follows in the Appendix quoted 
above: “All convection, uniform translatory motion 
(in free, i.e., empty aether), would then be indeterminate, 
there being no standard frame in which to 
locate it . . . in no way discredits the theory of 
an aether, unless the intrinsic atoms of matter can 
be abolished also.” Larmor characterizes Einstein’s 
postulate of relativity as an “algebraic correspondence,” 
“masquerading in the language of kinematics.” 
The second gravitational theory of relativity of Einstein 
(1915) is epitomized as an “auxiliary construct,” 
while “the absence of space and time and motion in 
the auxiliary construct is against reality.” 

It is distinctly interesting to observe that although 
Lorentz followed the brilliant new theories with all 
attention and determination, yet there remained in his 
mind vestiges of similar feelings. Thus he states in 
his Leyden Lectures 1910-12 (Silberstein-Trivelli’s 
translation): “If we do not like the name of the 
‘aether’ we must invent another name as a peg on 
which to hang all these things,” and “can not deny 
to the bearer of all these properties a certain substantiality; 
and if so, then one may in all modesty call 
true time the time measured by clocks which are fixed 
in this medium and consider simultaneity a primary 
concept.” 

One may sympathize with, and admit the accuracy 
in detail of, Larmor’s position in regard to the relativistic 
theories: it is true that the work of Fitzgerald, 
Larmor, Lorentz and Poincaré had shown the special 
theory of relativity to be “just around the corner”; 
but it was only Einstein who grasped the significance 
of the actual situation. It is likewise true that the 
general theory of relativity of 1915 has not effectively 
entered into physical speculation; yet this second 
achievement of Einstein has also exerted a considerable 
influence on the course of physical thought. In 
fact it has been Planck and Einstein together who 
have broken the magical spell which the classical concepts 
of Newton had cast over scientific and philosophical 
thinking. 

Larmor's mathematical-physical contributions extend 
over the entire classical field. His papers are 
extremely thoughtful and always repay careful reading. 
In the field of electromagnetism his contributions 
have been especially influential. Larmor's 
formula for the rate of radiation of energy from an 
accelerated electron is well known to all physicists, 
and the dispersion formulas due to him and Lorentz 
have been very useful. 
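
For reference, the radiation formula is usually written today as follows. The form given is the standard modern one in Gaussian units, valid for velocities small compared with that of light; the symbols are the conventional ones and are not taken from Larmor's papers. Here P is the rate of radiation of energy, e the charge, a the acceleration and c the velocity of light:

```latex
\[
  P = \frac{2}{3}\,\frac{e^{2}a^{2}}{c^{3}} .
\]
```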

When occasion required, he brought in subtle 
mathematical considerations, and always had the 
greatest appreciation of purely mathematical work. 
Indeed a number of his papers are essentially mathematical 
in character. His Presidential Address of 
1916 before the London Mathematical Society, “The 
Fourier Harmonic Analysis and its Scope in Physical 
Science,” shows a deep intuitive insight into the 
nature of these remarkable series, and a wide knowledge 
of their extremely varied applications. 

Larmor was a well-known and much valued figure 
at International Mathematical Congresses, being a 
participant in the Rome Congress of 1908 and the 
three succeeding congresses at Cambridge, England 
(1912), at Strasbourg (1920) and at Toronto (1924). 

The unmistakable impression which one gathers 
from his activities and writings, and from accounts 
by those who have known him, is that of a life of 
absolute sincerity and of unselfish devotion to the 
highest ideals, scientific and personal. He had a deep 
sympathy with younger people as witnesses, for example, 
his bequest to the University of Cambridge 
for medical and surgical assistance to the younger 
members of the faculty. 

His own attitude is clearly revealed at the end of 
the first part of his Presidential Address, referred to 
above, where he tries to look forward in the midst 
of the First World War to “the promise of nobler 
and more disinterested times.” Perhaps, in the midst 
of the second World War, we could not close in any 
way that would be more in accord with his outlook 
upon life than in voicing, as he did, the hope expressed 
by Shelley: 

The world's great age begins anew, 
The golden years return. 
The Earth doth like a snake renew 
Her winter weeds outworn. 

. . . . . 

A brighter Hellas rears its mountains 
From waves serener far, 
A new Peneus rolls his fountains 
Against the morning star. 






Reprinted from American Scientist, Vol. 31, Oct., 1943, pp. 281-310. 


THE MATHEMATICAL NATURE OF 
PHYSICAL THEORIES 


By GEORGE D. BIRKHOFF 
Harvard University 


THE prophetic conjecture that Nature is mathematical 
is one which goes back to Pythagoras and the ancient 
Greeks. The scientific development of the intervening 25 
centuries has only served to establish this conjecture to a 
remarkable degree. The complementary fact that mathematics 
is natural is, however, just beginning to be grasped. 
On the objective side we have Nature, and, on the subjective 
side, we have its counterpart in mathematics, the language 
of Nature. These two aspects are as intimately related as 
are the two sides of a single coin. 

To bring out the situation it is instructive to think of 
the development of science and language from the genetic 
point of view favored by those who think in biological- 
psychological terms. A child puts its hand too near the fire 
and is burned, and thereafter remembers that this A (touch¬ 
ing fire) will bring about this B (pain and burn). The chain 
of association fixed in his memory is essentially of the propositional 
type “A implies B.” He has learned a physical fact! 
Likewise in later situations he will find that not only does 
A imply B but B implies C, etc. The structure of thought 
is such that, by a kind of psychological shorthand, the intermediate 
term, B, is eliminated; and so he feels that “A 
implies C.” 

The essential genetic foundation here is obvious. The 
mental codification of the facts of Nature in logical and 
mathematical terms has its origin in the uniformity of Nature 
and of Mind. It becomes clear that, as Dewey has recently 
said, 1 “The basic importance of the serial relation in logic 
is rooted in the conditions of life itself.” 

To carry the same idea further, let us note that each aspect 
of Nature which can be technically isolated has its own appropriate 
mathematical expression. Indeed, the main body of 
mathematics has arisen from three very simple aspects of 
the external world. 

In the first place, our experience with events flowing in 
invariable sequence and recorded in memory leads to logic, 
as has already been indicated. It leads also to the associated 
notion of partially ordered classes or “lattices.” 2 These hierarchical 
arrangements of elements are found everywhere, but 
nevertheless only recently has the theory of lattices been 
studied for its own sake. 

Likewise the mental process of putting classes in one-to- 
one correspondence has as its external model a universe 
of distinguishable objects. The boy playing with piles of 
pebbles on the seashore is dealing with such a universe just 
as was the shepherd in ancient times who counted his flock 
by means of stones. In his ability to perform this simple 
process of one-to-one correspondence lies a basic difference 
between man and all other animals. It is self-evident that 
the matching game may be elaborated in various ways. For 
instance, two piles may be combined into a single pile. Practice 
in this technique led inevitably to the use of scratches 
and other marks, like the figure 3, for example, to designate 
the pile of reference containing three objects, with which all 
others might be compared. Thus the concept of number arises 
through experience with the universe of classes of identifiable 
objects, just as the concept of logical implication originates 
in the universe of ordered events. 

A similar situation holds in regard to geometry. Here the 
universe from which we start is one containing idealized 
rigid bodies in ordinary space. These may be compared with 
one another by the process of direct superposition. Gradually 
in this way there spring up from experience with physical 
objects various ideas concerning geometrical relations. Eventually 
the ruler and the protractor are found to serve as 
particular rigid bodies with the aid of which it is especially 
convenient to make geometrical measurements. 

1 Logic: The Theory of Inquiry, New York, 1938, p. 35. 

2 See, for instance, Garrett Birkhoff, Lattice Theory, New York, 1942. 

Now it is well known that the vast body of pure mathematics 
deals principally with logic, number, and geometric 
form. Thus we may assert with reason that the whole structure 
of mathematics has been directly suggested by three 
exceedingly simple aspects or models of Nature, namely, 
the universe viewed as recognizable and invariable sequences 
of events, as a collection of classes of identifiable objects, and 
as a collection of rigid bodies. 

With these three exemplars before them, mathematicians 
have set out to generalize and to modify. In the last century, 
the logical field has been greatly extended. Since early times 
other numbers such as zero, fractions, negative and imaginary 
numbers, and a host of other types have gradually been 
introduced as appropriate extensions of the integer. The 
motivation was not that afforded by any new model so much 
as that of generalization by abstraction, characteristic of 
mathematical thought. Likewise, after the systematic development 
by Euclid and others of ordinary geometry, there 
were invented n-dimensional geometries, non-Euclidean 
geometries, geometries in spaces of infinitely many dimensions 
and in abstract spaces. 

To the modern mathematical mind, geometry appears as a 
less fundamental discipline than does logic or number. This 
means that one may dispense with the more elaborate model 
based on rigid bodies. The reduction of geometrical thought 
to its basis in number was accomplished largely by Descartes 
and Riemann. More recently the logicians, such as Frege, and 
Russell and Whitehead, have tried to discover number in 
logic itself, and thus to start from the model world of invariable 
sequences of events and nothing more. But the success 
of this ambitious and important venture remains doubtful, 
and number seems certain to hold forever a basic position in 
all mathematical thought. 

The mathematician has created his grandiose mathematical 
structures, stimulated on the one hand by his own scientific 
imagination, and on the other by various suggestions and 
demands coming through aspects of the external world. His 
successes have truly been amazing. 

In order to understand his method of attack and how it 
can be applied to physical theories, it is instructive to start 
with geometry as a convenient and familiar prototype. It 
will be recalled that Euclid makes use of certain “self-evident” 
truths, taken as undeniable and called axioms. In 
modern parlance these are called “postulates” and are 
thought of as hypotheses. The whole system of such hypotheses 
constitutes the “set of postulates” out of which the 
rest of geometry is to be developed logically as a body of 
“theorems.” The general process of development has been 
twofold. On the one hand, the mathematician broadens the 
field of geometry by abandoning or modifying certain postulates 
in what seems to him to be an interesting manner. 
On the other hand, he spares no efforts in order to answer 
unsolved logical questions about geometry. For example, the 
Greeks propounded the question of “squaring the circle,” 
and only after two thousand years of effort was this problem 
solved by Lindemann following in the footsteps of Hermite. 
It is not too much to say that if Archimedes were to return 
to us, a year or two of intense and carefully directed mathematical 
study would be necessary before he would be able 
to follow the argument involved. 

The provisional and plastic character of sets of postulates 
was not fully understood until recently. The famous statement 
of Newton: “Hypotheses non fingo”—I do not frame 
hypotheses—indicates not merely that he had relied on his 
method of “Analysis” and subsequent “Synthesis,” but also 
that he had not fixed conscious attention upon various presuppositions 
involved in his dynamical and astronomical 
theories. Similarly, his great rival, the mathematician and 
philosopher Leibniz, said: “Far from approving the acceptance 
of doubtful principles, I would have people seek even 
the demonstration of the axioms of Euclid .... And when 
I am asked the means of knowing and examining innate 
principles I reply . . . that ... we must try to reduce them 
to first principles, i.e., to axioms which are identical or 
immediate by means of definitions which are nothing but a 
distinct exposition of ideas.” 

An interesting and characteristic feature is that the postulates 
of Euclidean geometry appear in qualitative form. Such 
a postulate as that any distinct points A and B determine a 
straight line evidently does not essentially involve the notion 
of number. By successive steps it is then found possible to 
introduce quantitative ideas, such, for instance, as are involved 
in the Pythagorean Theorem. The final purely quantitative 
expression of Euclidean geometry is to be found in 
the analytic geometry of Descartes, which is entirely based 
on number. Similarly, it is possible to start with the analytic 
formulation of geometry, and quickly establish by an inverse 
process the validity of the qualitative postulates first mentioned. 
This kind of treatment of a mathematical theory may 
be indicated by the schema: 

qualitative ⟷ quantitative. 

This double process is characteristic of mathematical thought 
in both the fields of pure and applied mathematics. 
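
The inverse passage, from the analytic formulation back to a qualitative postulate, can be checked mechanically on examples. The following is a minimal sketch in Python; the function names are illustrative and do not come from the text. A line is represented by a coefficient triple (a, b, c) of the equation ax + by + c = 0, and the sketch tests the postulate that two distinct points determine a straight line.

```python
def line_through(p, q):
    """Return (a, b, c) such that a*x + b*y + c = 0 passes through p and q."""
    if p == q:
        raise ValueError("the postulate concerns two distinct points")
    (x1, y1), (x2, y2) = p, q
    a = y2 - y1
    b = x1 - x2
    c = -(a * x1 + b * y1)
    return a, b, c


def on_line(line, point):
    """Analytic test of incidence: does the point satisfy the equation?"""
    a, b, c = line
    x, y = point
    return a * x + b * y + c == 0


def same_line(l1, l2):
    """Two nonzero triples give the same line when they are proportional."""
    a1, b1, c1 = l1
    a2, b2, c2 = l2
    return a1 * b2 == a2 * b1 and a1 * c2 == a2 * c1 and b1 * c2 == b2 * c1


A, B, C = (0, 0), (2, 4), (1, 2)   # C lies on the line AB
line_ab = line_through(A, B)

# Existence: the line found really contains both given points.
exists = on_line(line_ab, A) and on_line(line_ab, B)

# Uniqueness (on examples): other pairs of points on it give the same line.
unique = same_line(line_ab, line_through(B, A)) and same_line(
    line_ab, line_through(A, C)
)
```

Here `line_through((0, 0), (2, 4))` gives `(4, -2, 0)`, that is, the equation 4x − 2y = 0. Integer coordinates are used so that the equality tests are exact.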

Someone not deeply acquainted with mathematics may 
well ask at this juncture: “But does this characteristic passage 
from qualitative to quantitative occur in the case of 
logic and number?” The answer is in the affirmative. It is 
not difficult to formulate a basis for the integers and other 
types of numbers in terms of qualitative postulates; indeed 
such sets deserve to be a matter of everyday knowledge. 1 For 
example, starting with a collection of objects which we may 
call integers and with an operation of “addition” of two of 
them, a, b, to give a third, which we may write a + b, a 
first natural postulate would take the form a + b = b + a. 
The answer in the case of logic is not quite so obvious and has 
only been recently recognized. Nearly a hundred years ago 
the Irish mathematician, Boole, saw that logic was a kind of 
abnormal algebra in which there were only two symbols of 
quantity, namely 0 and 1. It has been realized only within a 
few years that logic permits of a normal algebraic formulation,
provided that the fundamental operation of addition
which Boole employed is modified. Thus starting from a
simple set of qualitative logical postulates, we obtain an
analytic logic based on numbers in the form of ordinary
sequences of 0’s and 1’s. 1

1 In our elementary text, Basic Geometry, Chicago, 1941, Professor Ralph Beatley
and I have inserted a set of “laws of number.”
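The modification of Boole's addition mentioned above can be illustrated by a short sketch. It assumes, following the Stone reference, that the modified addition is "exclusive or" (the sum modulo 2) and that multiplication is "and"; the function names are illustrative and do not come from the text. Under these assumptions the sequences of 0's and 1's obey the ordinary laws of a commutative ring, together with the special laws x·x = x and x + x = 0.

```python
from itertools import product

def add(x, y):
    """Modified addition: componentwise sum modulo 2 (exclusive or)."""
    return tuple(a ^ b for a, b in zip(x, y))

def mul(x, y):
    """Multiplication: componentwise product (logical 'and')."""
    return tuple(a & b for a, b in zip(x, y))

def check_ring_laws(n=3):
    """Check the ordinary algebraic laws on all 0/1 sequences of length n."""
    elems = list(product((0, 1), repeat=n))
    zero = (0,) * n
    for x, y, z in product(elems, repeat=3):
        assert add(x, y) == add(y, x)                          # a + b = b + a
        assert add(add(x, y), z) == add(x, add(y, z))          # associativity
        assert mul(x, add(y, z)) == add(mul(x, y), mul(x, z))  # distributivity
    for x in elems:
        assert add(x, zero) == x   # zero element
        assert add(x, x) == zero   # every element is its own negative
        assert mul(x, x) == x      # idempotence, the law peculiar to logic
    return True

def boole_or(x, y):
    """Inclusive 'or', recovered from the ring operations as x + y + xy."""
    return add(add(x, y), mul(x, y))
```

The inclusive "or" is not lost by the change of addition: it reappears as the combination x + y + xy, as the last function indicates.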

Before passing on to the consideration of other models of 
physical interest and their postulational treatment, it may 
be remarked parenthetically that, especially in the late nine¬ 
teenth century, the method of advance in theoretical physics 
was sharply contrasted with that employed in mathematics. 
In fact, the mathematical physicist was inclined to try to 
deal with the results of physical experiment without con¬ 
cerning himself more than was necessary about mathematical 
questions. It is true that the grandiose successes of New¬ 
ton were largely achieved through his remarkable mathe¬ 
matical genius; but after him it gradually became more and 
more the custom to look upon mathematics as merely a tool. 
As indicative of this attitude, I recall that not many years 
ago, after I had answered a query by a distinguished phy¬ 
sicist, he said: “I wish I had a mathematical slave.” 

However, nowadays it is hardly possible to regard mathe¬ 
matics as only a useful handmaid, since, for nearly four 
decades, the principal advances made in theoretical physics 
have resulted from mathematical insight and have involved 
a wider and wider circle of mathematical knowledge. In 
this period the successful theoretical physicists have been 
those endowed with mathematical vision, who believed in 
the power of the mathematical idea. The rank and file of 
physicists who lacked this quality have had to occupy them¬ 
selves either with verifying the theories so obtained in the 
laboratory or with elaborating details in the new theories. 

It is to be hoped that in the future more and more theo¬ 
retical physicists will command a deep knowledge of mathe¬ 
matical principles; and also that mathematicians will no 
longer limit themselves so exclusively to the aesthetic develop¬ 
ment of mathematical abstractions. 

1 See M. H. Stone, The Theory of Representations for Boolean Algebras, Transactions
of the American Mathematical Society, Vol. 40, 1936. Unfortunately no elementary
exposition of the central results is available.



We turn now to mention briefly two further simple physical 
models and their postulational forms. The first is that of 
time series; and the second is that of a system of forces acting 
on a point. It may be remarked that the notion of time series 
has been the basis for an almost infinite amount of philo¬ 
sophic speculation; for example, a good deal of Bergson’s 
theory of creative evolution is based on the metaphysical 
thesis that the theorem which results with postulates for time 
like those given below, is somehow not valid. 

The postulates for the time series may be formulated 
readily: The “undefined class” of objects which we consider 
is that of “instants of time.” The “undefined relation” be¬ 
tween two (distinct) instants is that of “precedence”; thus 
if A precedes B in time, we write A < B.

A set of qualitative postulates is then the following: 1 

I. If A < B, then A = B and B < A are false. If A < B and
B < C, then A < C. 

II. For any A there exist X, Y such that X < A < Y. 

Definition: An (open) interval is a set of points X of one of the four
types: A < X < B; X < A; B < X; X arbitrary.

III. If ... < A_{-1} < A_0 < A_1 < A_2 < ...,

then the set of the A_i and of the intervals (A_i, A_{i+1}) constitutes an
interval.

IV. Any two intervals can be put in 1 - 1 correspondence, preserving 
order (<). 

V. Any set of distinct intervals is numerable. 

On this basis it is not hard to prove that with each instant 
of time there may be associated a numerical variable which 
ranges from negative infinity to positive infinity, in such a 
way that if A < B, the number of A is less than that of B, 
and vice versa. This end-result or theorem constitutes the 
fundamental quantitative theorem of the theory. It is obvious 
conversely that if we have a collection of objects associated 
with numbers in this way, and interpret A < B as meaning 
that the number of A is less than the number of B, then all 
of the preceding system of postulates will be fulfilled. 

1 This set of postulates for the “time series” appears to be novel in the postulate IV of 
the isomorphism of all (open) intervals. The postulate V, of a type due to Souslin, is 
found to be necessary. I propose to publish a brief note on these postulates. 



The great English mathematical physicist, James Clerk 
Maxwell, sometimes found it amusing to put his ideas in the 
form of mathematical verse. I have ventured to attempt some¬ 
thing similar, to which my friend the poet David McCord has 
given, at my request, a more poetic form: 

The instants of time form a limitless sequence. 

Precedence still rules them no matter their frequence. 

Each infinite set in an interval settles, 

Resembling its kin as a petal does petals; 

Besides, one can always in sequence review 
A collection of intervals standing in queue. 

This granted, good reason not rime will define 
That the instants are ordered like points on a line. 

We turn next to our first obviously physical model, namely 
that of a set of forces acting at a point. Almost every person 
has an intuitive grasp of the idea of forces, and their com¬ 
bination in a resultant force. A possible qualitative set of 
postulates for this simple physical universe is the following: 

1) Forces which act in the same line have a resultant 
whose magnitude is the sum of the magnitudes of the com¬ 
ponent forces. 

2) The resultant of forces does not depend on the order or 
grouping of the constituent forces, varying continuously 
with them. 

3) The resultant of forces is independent of the unit and 
of orientation in space. (Principle of Sufficient Reason.) 

It will be observed that here the postulates are not wholly 
qualitative, since the idea of the magnitude of a force is 
involved. In all physical theories the idea of number is 
accepted for convenience as if it were qualitative. 

The final quantitative result in this case is the familiar 
Parallelogram Law embodied in the next figure: 



In this case David McCord’s verse is as follows: 



The resultant of forces in line 

Is the sum of these forces. Combine 

Two or more which don’t happen to act 

In one line, it remains as a fact 

That their order and grouping don’t count. 

And again, disregard the amount 

The resultant is thrown out of joint 

Should some forces be changed half a point. 

Change a unit, direction in space— 

Your resultant still falls into place. 

When forces so act without flaw, 

They obey Parallelogram Law. 

It will be noted that in the last postulate, reference is made 
parenthetically to the Principle of Sufficient Reason of Leib¬ 
niz. This is an extremely useful principle which may be 
employed whenever there is an underlying ambiguity. In the 
present case the ambiguity is that of orientation in space 
and of the unit of length. Later on, we shall observe that such 
ambiguity is always associated with a “group.” 

Let us attempt the direct use of the Principle before such 
further elaboration. In the first place, let us observe that if 
there are two equal forces F, their resultant force must lie 
in the plane which they define. For why should it fall on one 
side rather than on the other? Likewise it is clear that this 
resultant must lie along the bisector of the angle formed by 
the two given forces, for a similar reason. These simple con¬ 
clusions indicate how the Principle may be used. 

As a matter of fact we propose to go one step further and 
show that, in the special case when the two equal forces F 
are at right angles to one another, their resultant must be 
precisely the diagonal of the square which they bound, as 
required by the Parallelogram Law, and shown in the figure.

[Figure: two equal forces F at right angles and their resultant along the diagonal of the square.]

This means that the magnitude of the resultant force must 
be F multiplied by the square root of two (√2 F), according
to the Pythagorean Theorem. Of course we already know 
that the resultant will fall along the bisector in virtue of what 
has just been said above. 

In order to prove our statement we start with the given
force F and add to it two equal and opposite forces of one-half
the magnitude, F/2, which are at right angles to the given force (see
(b) in the figure below). Moreover we break up the force F
into two parts also of magnitude F/2, one associated with
each of the forces first introduced. At this stage (see (c)) we
have a system of two pairs of forces whose entirety must be
equivalent to the force from which we started, on the basis
of the postulates written above. But now consider one of
these pairs of two forces of magnitude F/2 and mutually
perpendicular. Their resultant will fall along their bisector
with a magnitude k times the magnitude of the two components.
Thus the two pairs of forces reduce to two forces
of magnitude kF/2 at right angles to one another (see (d)).
Finally, since the size of the unit does not matter according
to the Principle, we see that the final resultant will have a
magnitude k times that of each of the two components,
namely k^2 F/2 (see (e)). But this resultant force must, of
course, be nothing other than the force from which we started.
This shows that k must be the square root of 2.

[Figure: the successive steps (b) through (e) of the construction.]
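The argument can be condensed into a single relation. In the notation below, which is introduced only for illustration, R(F) denotes the magnitude of the resultant of two equal perpendicular forces of magnitude F; independence of the unit gives R(F) = kF for a fixed positive number k.

```latex
\begin{align*}
R(F) &= kF && \text{(independence of the unit)}\\
F &= R\!\left(R\!\left(\tfrac{F}{2}\right)\right)
   = k\cdot\left(k\,\tfrac{F}{2}\right) = \tfrac{k^{2}}{2}\,F
   && \text{(steps (c) to (e))}\\
k^{2} &= 2, \qquad k = \sqrt{2} \quad (k > 0).
\end{align*}
```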

The short argument just made really constitutes a genuine 
introduction to the mathematical proof of the Parallelogram 
Law. It was this type of proof, for instance, which was given 
by Laplace in his Mécanique Céleste.

With these preliminaries in mind, we are in a position to
consider, advantageously, physical models of a less elementary
type and, in particular, such as satisfy the basic principle
of causation or uniformity. This principle asserts that
a given initial state of a system determines the unfoldment
of all of the subsequent states.

Between the many theories of this causational type there 
is a difference in character. Some have been designed to 
account for a special aspect of nature, but were never intended 
to be taken as foundational for physics as a whole. We will 
call the models used in such cases proximate models. Other 
models have seemed to contain within them some germ of 
possibility of universal application, and these will be termed 
ultimate models. Before taking up the study of some ultimate 
models, it will be worthwhile to consider briefly one impor¬ 
tant proximate model, namely, that involved in the so-called 
analytic theory of heat. 

Suppose that we consider a set of rigid bodies fixed in 
position in ordinary space. Each of these is thought of as 
isotropic and characterized by a certain ability to conduct 
heat and to contain it. In other words, at each point of these 
bodies there is a certain specific conductivity and a specific 
heat, and these constants are taken to be the same every¬ 
where in one and the same body. The fundamental inde¬ 
pendent variables which enter are those of position and time, 
whereas the fundamental dependent variable is the tempera¬ 
ture. For this theory, the postulates are usually stated in 
simple quantitative form, although they readily admit of 
partially qualitative formulation, as when it is said that heat 
always flows from warm to cold. These postulates lead to a 
single central differential equation, the so-called equation 
for the flow of heat. Thus the definitive quantitative form 
of the theory is contained in this equation. By its aid, we 
are able to determine the flow of heat which will result from a 
given initial state of the system. In this way Fourier was 
able to solve a great variety of problems. It is interesting to 
recall how Kelvin later on applied Fourier’s theory to the 
problem of estimating the age of the earth. For this particular 
case the earth was taken as an isotropic sphere at high initial 
temperature from whose surface heat was conducted away 
into empty space according to known physical laws. By using
the known data concerning the temperatures in deep mines, 
it was possible to obtain a reasonable quantitative estimate. 
Of course since the discovery of the phenomenon of radio¬ 
activity this estimate has been discarded. Evidently this 
model of Fourier is one of proximate type. 
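As an illustration of how the equation for the flow of heat determines later states from an initial one, the following minimal sketch integrates the one-dimensional equation ∂u/∂t = κ ∂²u/∂x² by an explicit finite-difference scheme. The one-dimensional rod, the constant κ, the grid, and the function names are assumptions made for the example; they are not taken from the text, and the scheme is not Fourier's own method of solution.

```python
def heat_step(u, kappa, dx, dt):
    """Advance temperatures u by one explicit step of u_t = kappa * u_xx.

    The two end values are held fixed (prescribed boundary temperatures).
    """
    r = kappa * dt / dx ** 2
    if r > 0.5:
        raise ValueError("explicit scheme is unstable for kappa*dt/dx^2 > 1/2")
    new = list(u)
    for i in range(1, len(u) - 1):
        # Heat flows toward a point from warmer neighbours, away to colder ones.
        new[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new

def evolve(u0, kappa=1.0, dx=1.0, dt=0.25, steps=200):
    """Return the temperature profile after the given number of steps."""
    u = list(u0)
    for _ in range(steps):
        u = heat_step(u, kappa, dx, dt)
    return u

# A rod with cold ends (0 degrees) and a hot interior (100 degrees).
initial = [0.0] + [100.0] * 9 + [0.0]
final = evolve(initial)
```

In this sketch the interior temperatures fall steadily toward the temperature of the ends, in accordance with the qualitative statement that heat always flows from warm to cold.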

Later on, heat was explained by Maxwell and others by 
means of the so-called kinetic theory of heat. In this theory 
heat is conceived of as a mode of motion of the atoms con¬ 
stituting the body in question. Roughly speaking, a rise in 
temperature is taken to correspond to a greater average 
velocity of the atomic particles. With this interpretation, 
it is possible to justify the postulates of the analytic theory 
of heat. Evidently the kinetic theory is of more basic type, 
and might provide an ultimate explanation of heat—if the 
universe were dynamical instead of electrodynamical! 

The conceptual models which have been employed by 
physicists are extremely varied and numerous. It almost 
seems as if Nature had desired to anticipate every possible 
mathematical system, and to employ every kind of invention. 
My distinguished friend and colleague Bridgman, to whom I 
made a remark of this sort some time ago, replied that, so far 
as he knew, Nature had never used the principle of the wheel!
At any rate, the varied models exemplified in Nature seem 
to be beyond the range of human imagination, and we are 
only commencing to understand a few of them. 

Practically all of the proximate models of Nature share 
certain characteristics due to the fact that they accept the 
principle of uniformity. The postulates, expressed in terms 
of number, soon lead to certain basic differential equations 
which are expressed in the symbolism of the calculus. Thus 
the theory of the model is reduced to the solution of differen¬ 
tial equations. The treatment of the motion of our planetary 
system provides a well-known example. 

We now pass to a consideration of the principal attempts 
at ultimate theories from the time of Newton onwards. So 
far as their mathematical nature is concerned, we may note
in advance the following general conclusions: There is in 
each case an underlying four-dimensional conceptual framework
of space and time, with the possible exception of the
latest quantum-mechanical phase. In each case there is a 
fundamental ambiguity and corresponding mathematical 
“group” so that the Principle of Sufficient Reason may be 
applied. Finally, corresponding to each group there is an ap¬ 
propriate mathematical language. 

To a considerable extent the various types of theory may 
be obtained from one another, by a process akin to trans¬ 
position to the modified language. 

We shall also attempt to point out some of the mathe¬ 
matical difficulties and physical limitations in the various 
cases. In conclusion, some remarks of a general character 
will be made. 

* * *

Classical physics starts with the framework of Absolute 
Time and Absolute Space. This is the framework to which 
all of our commonsense experience in everyday life inevitably 
leads us. Let us consider the character of this framework 
from what may be called the group-theoretic point of view. 

With Newton, we think of time as an equably flowing 
variable, the same for all observers. However, there is an 
underlying ambiguity in that the various instants of time 
are wholly indistinguishable from one another. Once, how¬ 
ever, a particular instant or epoch has been chosen from 
which to measure time, all other instants may be charac¬ 
terized uniquely by their temporal distance from the epoch. 
Thus the epoch chosen for practical purposes is the begin¬ 
ning of the Christian era. 

It is interesting to recall that, from the philosophic point 
of view of Leibniz, this ambiguity of structure precluded the 
possibility that Time is an Absolute Being. Briefly, his argu¬ 
ment was that God would reject Time because none of the 
decisions of God are ambiguous. In this way Leibniz con¬ 
cluded that both Space and Time were relative rather than 
absolute. 

For us, however, it is the group involved, reflecting this 
ambiguity, that is significant. To explain this idea, let us 
fix attention upon a square ABCD. There are four operations
of rotation which leave the square in the same position:
namely, the operation I (the so-called identity) which does
nothing; the operation R which rotates the square through
90°; the operation S which rotates it through 180°; and the
operation T which rotates it through 270°. The successive
combination of any two of these operations amounts to a
third. Thus R, followed by S, yields T: RS = T. Thus we
get a “multiplication table” of the group of operations which 
embodies all of its essential properties. In the adjoining 
figure we give this table: 




        |  I  R  S  T
     ---+------------
      I |  I  R  S  T
      R |  R  S  T  I
      S |  S  T  I  R
      T |  T  I  R  S


It will be noted that for each operation there is another which 
undoes what the first does. Thus R followed by T does noth¬ 
ing, that is, RT = I. 

Any collection of operations which can be combined in 
this way and which includes, along with any operation, 
another inverse operation which undoes what the first does, 
is called a group. The identical operation, I, which does 
nothing, is always considered to be a member of the group. 
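The definition can be checked mechanically for the four rotations of the square. In the sketch below each operation is represented by the number of quarter-turns it performs (I = 0, R = 1, S = 2, T = 3), so that one operation followed by another corresponds to addition modulo 4; this encoding and the function names are assumptions made for illustration.

```python
NAMES = "IRST"  # I, R, S, T rotate through 0, 90, 180, 270 degrees

def compose(a, b):
    """Return the operation equal to a followed by b."""
    quarter_turns = (NAMES.index(a) + NAMES.index(b)) % 4
    return NAMES[quarter_turns]

def inverse(a):
    """Return the operation which undoes a."""
    return NAMES[(-NAMES.index(a)) % 4]

def multiplication_table():
    """Rows are the first operation, columns the second."""
    return {a: [compose(a, b) for b in NAMES] for a in NAMES}

def is_group():
    """Check closure, identity, inverses and associativity."""
    for a in NAMES:
        if compose(a, "I") != a or compose("I", a) != a:
            return False
        if compose(a, inverse(a)) != "I":
            return False
        for b in NAMES:
            if compose(a, b) not in NAMES:
                return False
            for c in NAMES:
                if compose(compose(a, b), c) != compose(a, compose(b, c)):
                    return False
    return True
```

Calling `compose("R", "S")` reproduces the relation RS = T of the text, and `compose("R", "T")` the relation RT = I.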

In the case of the model of Absolute Time, the operations 
of the group are those which add or subtract a constant value 
to or from the measure of time, or, what is really the same
thing, shift the epoch. It will be observed that such a shift 
will not change in the least the interval of time between two 
instants. Such an unchanging quantity under the operations 
of a group is said to be an “invariant.” Likewise, we may
speak of “invariable properties.” Thus in the case of the 
group of the square, the property that two vertices are oppo¬ 
site or adjacent is invariable. 

In Absolute Space, there is a similar ambiguity but of 
more extensive type. All directions in space and all reference 
points are on a common basis. It was for this reason that 
Leibniz asserted that Space was not an Absolute Being. From 
the point of view of the theory of groups, we may say that 
all translations and rotations of the space into itself form 
its group, and the corresponding invariant is that of the dis¬ 
tance between two points. There are also other invariants, 
such as the angles between two lines, but these may be con¬ 
sidered to be derived from the still more basic invariant of 
distance. 
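A numerical check of this invariance, for a rotation about one axis combined with a translation, might look as follows. The particular axis, angle, shift, and points are arbitrary choices made for the example, and the function names are illustrative.

```python
import math

def rotate_z(p, angle):
    """Rotate the point p = (x, y, z) about the z-axis through the given angle."""
    x, y, z = p
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def translate(p, shift):
    """Translate the point p by the vector shift."""
    return tuple(a + b for a, b in zip(p, shift))

def motion(p, angle, shift):
    """A rigid motion: rotation about the z-axis followed by a translation."""
    return translate(rotate_z(p, angle), shift)

def distance(p, q):
    """Euclidean distance, the basic invariant of the group of motions."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

P, Q = (1.0, 2.0, 3.0), (-4.0, 0.5, 2.0)
P2 = motion(P, 0.7, (10.0, -3.0, 5.0))
Q2 = motion(Q, 0.7, (10.0, -3.0, 5.0))
```

The coordinates of both points change under the motion, while `distance(P, Q)` and `distance(P2, Q2)` agree up to rounding error.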

It could be shown mathematically that the theory of one¬ 
dimensional time and three-dimensional space, as presented 
ordinarily, is contained in the theory of the associated groups 
and their invariants. 

Now, although Newton did not explicitly recognize the 
fact, there was involved in his theory a more extensive ambi¬ 
guity than that to which we have referred above. His under¬ 
lying postulates were such that they did not distinguish 
between any particular framework and any other which was 
moving with uniform velocity in a straight line with respect 
to the given framework. Furthermore, there is almost nothing 
in the Newtonian theory which differentiates between units 
of length or units of time. In this classical theory there appear 
two fundamental invariants, namely, the interval of time 
between events, and the distance between points at a certain 
time. 

Mathematics has provided a classical language which is 
appropriate to the Newtonian theories and the underlying 
group. This is the language of 3-vectors. A 3-vector is a 
directed quantity, like a velocity, which is characterized by 
its direction and magnitude in ordinary three-dimensional 


904 



296 


American Scientist 


space. This abbreviated vectorial language is used constantly 
in all of classical physics when it is desired to express the 
theory in as condensed terms as possible. 

As I have shown elsewhere, the essential significance 
of Leibniz’ Principle of Sufficient Reason is group-theoretic. 1 
The general relationship involved may be indicated by the 
synoptic formula: 

Metaphysics ↔ Principle of Sufficient Reason ↔ Theory of Ambiguity and Groups

Of course from our present point of view, we must think of 
Physics as a special branch of Metaphysics. 

To bring out more clearly the application of this group- 
theoretic principle, let us consider for a moment the law of 
gravitation of Newton. If we have two gravitating particles, 
it is clear that the forces between them must lie in the line 
joining the two particles, inasmuch as spatial rotations about 
this line are operations of the group of motions, and the 
resultant forces must be independent of these operations. 
Likewise if the two particles are relatively at rest, the forces 
can only depend upon the mutual distance. Of course it is 
natural to take the two forces to be equal, since observation 
shows that under ordinary circumstances action and reaction 
are equal; and it is simplest to suppose that the force only 
depends upon the distance even if the particles are in relative 
motion. Furthermore, it is philosophically plausible to con¬ 
jecture that the force varies continuously with the distance 
and tends to become small at great distances. In this way 
the law of gravitation of Newton is indicated as one of the 
simplest and most reasonable possibilities, on the basis of 
the underlying group and the requirement of mathematical 
simplicity. Another equally simple possibility is that the 
force might be inversely proportional to the first power of 
the distance instead of the second power as in the Newtonian 
law. But this possibility is ruled out by Kepler’s planetary 
laws for the motion of two attracting bodies. 
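One way to see how Kepler's laws discriminate between the two possibilities is to consider a circular orbit. The following short computation is a standard argument supplied for illustration, not the author's own; it assumes a particle of mass m moving on a circle of radius r about a fixed center, attracted by a force C/r^n, with period T.

```latex
\begin{align*}
\frac{m v^{2}}{r} &= \frac{C}{r^{n}}, \qquad v = \frac{2\pi r}{T}
\quad\Longrightarrow\quad
T^{2} = \frac{4\pi^{2} m}{C}\, r^{\,n+1}.\\
n = 2 &:\quad T^{2} \propto r^{3} \quad \text{(Kepler's third law)},\\
n = 1 &:\quad T^{2} \propto r^{2} \quad \text{(contradicts Kepler's third law)}.
\end{align*}
```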

Nearly all of classical physics of the dynamical era has
been constructed more or less consciously on the basis of
group-theoretic considerations mixed with philosophical requirements
of the type which we have just indicated. This
method has shown itself enormously successful in advancing
our knowledge of the physical universe.

1 The Principle of Sufficient Reason, The Rice Institute Pamphlet, January, 1941.

Besides the conceptual framework of space and time which 
forms the stage of classical physics, we must specify concep¬ 
tual models of matter which may occupy this space in time. 
The starting point is naturally the particle and the rigid 
body. Then follow models convenient for hydrodynamics 
and elasticity, such as isotropic adiabatic fluids and isotropic 
elastic solids. Upon these may be imposed certain geomet¬ 
rical constraints, as, for example, when the center of a rigid 
sphere is constrained to be at a fixed point. More generally, 
various forces of attraction or repulsion consistent with the 
conservation of energy are allowed. Thus we are led to 
dynamical systems essentially characterized by differential 
equations of the Lagrangian type. These equations may be 
written down as soon as the explicit expressions of the 
“kinetic” and “potential” energies have been determined. 
The celebrated principle of the conservation of energy affirms 
that the total energy, both kinetic and potential, always 
remains constant. 
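In symbols, the passage from the two energies to the equations of motion may be sketched as follows. The notation (coordinates q_i, kinetic energy T, potential energy V, Lagrangian function L) is the customary one and is supplied here for illustration; the single particle on a spring is an assumed example, not one taken from the text.

```latex
\begin{align*}
L(q,\dot q) &= T(q,\dot q) - V(q), \qquad
\frac{d}{dt}\frac{\partial L}{\partial \dot q_i} - \frac{\partial L}{\partial q_i} = 0
\quad (i = 1,\dots,n).\\[4pt]
\text{Example: } T &= \tfrac12 m\dot q^{2},\quad V = \tfrac12 k q^{2}
\;\Longrightarrow\; m\ddot q + kq = 0,\\
\frac{d}{dt}\,(T+V) &= \dot q\,(m\ddot q + kq) = 0 .
\end{align*}
```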

It is a remarkable mathematical fact that the Lagrangian 
equations of a complicated dynamical system may be derived 
from such a unitary principle as the Principle of Least Action. 
It was Maupertuis and Leibniz who first stressed the impor¬ 
tance of minimum or variational principles such as that of 
least action in physics. Later on, we shall attempt to assess 
to some extent their actual importance for physics. At the 
moment, however, we merely observe that in the Lagrangian 
formulation it was the difference between kinetic and poten¬ 
tial energy which was involved in this formulation, and this 
difference has no direct physical significance. 1 Similarly, 
there was an artificial element in the Hamiltonian variational 
condition, called Hamilton’s Principle, which was later ob¬ 
tained and which apparently involved only the total energy 
H, for, in order to express the equations of motion by means 

1 Sec Jerome Fee, Maupertuis and the Principle of Least Action, American Scientist, 
April, 1942, and Scientific Monthly, June, 1941, for a discussion of this principle. 


906 



American Scientist 


298 

of this new principle, it was necessary to double the number 

of dependent variables. # . 

Despite the artificiality of such variational principles, they 
were regarded as of outstanding theoretical significance 
throughout the classical era. The dynamical formulation of 
physical laws was regarded as likely to lead to an ultimate ex¬ 
planation, even after it was recognized that electricity played 
an absolutely fundamental role in the atomic domain. In 
fact Maxwell found that his electromagnetic equations could 
be derived from a suitable variational principle, and it became 
the consensus of opinion that in such principles was to be 
found the mystical element uniting matter, electricity, and 
the ether. 

It was not noticed that such an ether did away completely 
with the relativity of motion which had been present in the 
Newtonian framework. The group allowed was now reduced 
to a mere change of point of reference or of orientation in 
space. 

From the new point of view, rigid bodies and other types 
of matter were thought of as the bearers of electric charges 
whose motions were to be determined not only by the physical 
forces of dynamical origin, but by the electrodynamic forces 
as well. In this spirit, suggestive models of atomic structure 
were proposed by Rutherford (1911) and others, on the basis 
of a positively charged nucleus surrounded by a certain num¬ 
ber of negatively charged electrons in positions of equi¬ 
librium. Furthermore, a great many optical phenomena, such 
as optical rotation, double refraction in crystalline media, 
etc., were accounted for. To be sure, in each case an ad hoc 
atomic model was constructed in order to obtain an explana¬ 
tion. 

Thus at the end of this dynamical era, there was prevalent 
in most physical quarters a conviction that by a judicious 
combination of dynamical and electrodynamical laws gov¬ 
erning neutral and electrically charged masses moving in a 
fixed electromagnetic ether, it would be possible to furnish 
an explanation of all the phenomena of Nature. The Prin¬ 
ciple of Least Action and other variational principles were 
conceived as somehow affording the basis of unification. 



The purely electromagnetic era was ushered in by the re¬ 
markable experiments of Michelson and Morley in 1887. 
Taken at their face value, these suggested that matter was not 
in the least affected by its motion through the ether, and it be¬ 
came natural to extend this idea as far as it could be con¬ 
veniently applied. Around 1892 the Irish mathematical 
physicist, Fitzgerald, and his Dutch contemporary Lorentz, 
made the hypothesis that matter was contracted in the direc¬ 
tion of its motion by a certain fraction dependent upon its 
velocity with respect to the ether. In this way the negative 
results of the Michelson-Morley experiments could be ade¬ 
quately accounted for. Somewhat later another Irish mathe¬ 
matical physicist, Larmor, assumed explicitly that a similar 
contraction took place in time, that is to say, moving clocks 
were supposed to be slowed down in the same ratio. In this 
way, it became gradually apparent that the complete electro¬ 
magnetic equations could be treated so as to afford a natural 
explanation of the null effect of motion through the ether. 

Today it is evident that, by the thoroughgoing use of the 
“Lorentz group” associated with this type of spatiotemporal 
contraction, it is possible to give a qualitative approach to the 
electromagnetic equations so that they appear to be as na¬ 
tural from the group-theoretic point of view as does Euclidean 
geometry. It must be emphasized that, before this group and 
its associated ambiguity were introduced, the exact form of 
the electromagnetic equations had appeared as extraordinar¬ 
ily artificial. Here, then, was a first great triumph of the 
purely electromagnetic point of view. 

Another remarkable fact was that the form of relativity of 
motion appearing in the new theory was really that which 
would be suggested naturally to an astronomer who looked 
out upon the stellar universe with completely impartial view. 
To him, all of the stars would seem to be on an equal basis, 
and purely mathematical argumentation, in which it was pos¬ 
tulated that such an equivalence existed, inevitably leads to 
the same form of space-time framework as before. We may 
therefore call the space-time background “astro-electric.” 

This modification of cosmic outlook was brought about 
largely through the work of Larmor and Lorentz, but it was
Einstein who first, in 1905, embodied it in his “special theory
of relativity.”

In the new world thus set up, there is essentially only one 
invariant, which Lorentz called “local time”; it will be re¬ 
called that in the older theory there were the two invariants 
of distance and of time. The group involved restores the rela¬ 
tivity of motion which had been lost when the concept of a 
stationary ether made its way into physical speculation. This 
Lorentz group mixes up space and time inextricably so that 
the concept of simultaneity no longer retains absolute sig¬ 
nificance. Nevertheless, the new group involves exactly as 
many arbitrary constants (nine) as the Newtonian group, 
and hence it is fair in a certain sense to say that the new relativistic
framework is no less absolute than the classical framework!
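The loss of absolute simultaneity, and the survival of a single invariant, can be exhibited numerically. The sketch below assumes one space dimension, units in which the velocity of light is 1, and the standard form of the Lorentz transformation for relative velocity v; it also takes the invariant to be the combination t² − x² formed from the separation of two events, an identification assumed here. The function names are illustrative.

```python
import math

def boost(event, v):
    """Apply a Lorentz transformation with velocity v (|v| < 1) to (t, x)."""
    t, x = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t))

def interval(e1, e2):
    """The invariant t^2 - x^2 formed from the separation of two events."""
    dt = e2[0] - e1[0]
    dx = e2[1] - e1[1]
    return dt * dt - dx * dx

A, B = (0.0, 0.0), (0.0, 3.0)   # two events simultaneous for the first observer
A2, B2 = boost(A, 0.6), boost(B, 0.6)
```

For the second observer the two events no longer occur at the same time, and their spatial separation is altered as well, yet `interval(A, B)` and `interval(A2, B2)` agree up to rounding error.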

At the present time, every worker in the electromagnetic 
theory assumes that the background in which phenomena 
take place is of this type. The reason is that, in this way, a 
complete mastery of the equations and their transformations 
is readily obtained. 

The basic language now becomes that of 4-vectors, whereas 
in the previous Newtonian era that of 3-vectors had been 
fundamental. The new language obeys the same formal laws 
as the old, but the vectors involved have four components in¬ 
stead of three. 

What is lost in this change is the possibility of an analogue 
of the rigid or elastic body. Since an electrically charged 
body tends to expand, due to the enormous forces of electrical 
repulsion which exist, something like the rigid or elastic body 
is necessary to prevent the atom from exploding. However, 
it has not been found possible to devise any adequate substi¬ 
tute for such bodies in electromagnetic space-time, so that 
this difficulty remains to the present day. It might be thought 
that the point particle of electrically charged matter would 
provide an adequate carrier of the electrical charge, but, due 
to the fact that infinite forces and energy are involved, the 
mathematical difficulties of dealing with such particles are 
found to be extremely great, and it seems doubtful that it is 
possible to employ the particle model as a basis. 


It thus becomes a fundamental question as to what types 
of matter can be used. Cosmic dust, that is, matter char¬ 
acterized by density without pressure, can collapse to a point 
of finite mass, can interpenetrate other cosmic dust, and can 
even turn inside out. Furthermore, electrically charged cos¬ 
mic dust would tend to expand and scatter with incredible 
rapidity. If, however, we introduce the simplest type of mat¬ 
ter in which pressure occurs, namely an isotropic adiabatic 
fluid, there is a fundamental difficulty which arises when two 
portions of fluid collide from opposite directions at velocities 
exceeding the disturbance velocity in the fluid. In this event 
the mathematical equations themselves break down. 

Consequently it appears there is really only one type of 
matter which is self-consistent from the mathematical point 
of view, namely, that in which the disturbance velocity at all 
densities is precisely the velocity of light. In 1926 I intro¬ 
duced this form of matter under the name of the “perfect 
fluid.” The equation of state of such a fluid was found to be 
simply (in units of seconds and light-seconds) : 

pressure = density. 

This fluid is supposed to be in equilibrium at a “cosmic pres¬ 
sure” not zero. The perfect fluid is very nearly incompres¬ 
sible and of almost invariable mass under ordinary circum¬ 
stances. 

It should be added that up to the present time there has 
been no indication whatsoever that physicists would be at 
all interested in such a perfect fluid just because it is the 
unique form of isotropic adiabatic fluid which can carry elec¬ 
tricity without introducing mathematical difficulties. In¬ 
deed, the question of mathematical consistency has never 
worried theoretical physicists very much. 

It is my considered opinion that isotropic adiabatic fluids 
other than the perfect fluid are not only faulty in mathe¬ 
matical construction, but are aesthetically inappropriate to 
the background of electromagnetic space-time. Neverthe¬ 
less such forms of isotropic adiabatic fluid have been used 
freely by Einstein and others for theoretical purposes. 

In 1912, Nordstrom attempted to develop a simple theory
of gravitation which would be appropriate to the new space-time.
His theory accounts adequately for ordinary gravitational
phenomena. However, it was at once noted that, according
to this theory, the perihelion of Mercury would experience
a retrogression rather than the advance which had
been observed. Consequently this theory was immediately
abandoned.

Very recently 1 I have found that by introducing the perfect
fluid as the basic form of matter, it is possible to formulate in
the same language of 4-vectors a simple theory of gravitation
which accounts for general gravitational phenomena, and also
explains the three crucial effects that have been observed.
One of these effects is that of the advance of the perihelion of 
Mercury referred to above, and the other two are the bending 
of light by the Sun, and the shift of spectral lines towards the 
red in light coming from the Sun, both of which had been pre¬ 
dicted by Einstein’s theory of 1916. My theory was suggested 
in part by the form of the theory of Einstein, and in part by 
the Nordstrom theory. All of these attempts by Nordstrom, 
Einstein, and myself are in a certain sense direct transposi¬ 
tions of the theory of Newton, on the basis of 4-vectors instead 
of 3-vectors in the cases of Nordstrom and myself, and on the 
basis of the tensors appropriate to curved space-time in the 
case of Einstein. 

In my opinion, the conceptual possibilities of the electro¬ 
magnetic framework deserve the closest study. Unfortun¬ 
ately, changes in the point of view of theoretical physicists 
have been so rapid in the last decades that promising direc¬ 
tions of advance have been hastily given up without due ex¬ 
ploration of their possibilities. The mathematicians, if not 
the physicists, should take up the highly important task of 
studying more deeply the construction of physical models on 
the basis of electromagnetic space-time. 

* * * 

1 See my paper presented on February 20, 1942 at the occasion of the Pan-American
Astro-Physical Congress held at Puebla, Mexico, in connection with the inauguration of
the Astronomical Observatory at Tonantzintla, under the title El Concepto de Tiempo
y la Gravitación. It will be published in the Proceedings of the Congress. Also see a
forthcoming article Newtonian and Other Forms of Gravitational Theory, shortly to
appear in the Scientific Monthly, and a brief technical account in the Revista de Ciencias
of the University of San Marcos, Lima, No. 441, 1942.



When it was found that Nordstrom’s and allied attempts 
at direct extension of the Newtonian gravitational theory to 
electromagnetic space-time failed to provide an explanation 
of the first crucial effect specified above, it was natural to try 
to devise other forms of gravitational theory. The outcome 
was Einstein’s theory of relativity of 1916 which seemed, at 
first glance, to suggest that physics might be nothing more 
than a kind of four-dimensional geometry. 

This theory was based on the hypothesis of a curved space- 
time in which the small portions of space-time are essentially 
of the same type as electromagnetic space-time. The single 
invariant of local time (analogous to arc length) was made to 
depend upon ten gravitational potentials, and the theory came 
under the type of four-dimensional geometry studied earlier 
by Riemann. Ricci and Levi-Civita had shown, too, that for 
the expression of geometrical properties in such a Riemannian 
geometry, the language of tensors was appropriate. The 
properties of such tensors were developed in the “absolute dif¬ 
ferential calculus.” Fortunately, Einstein found this instru¬ 
ment available in the elaboration of his basic Principle of 
Equivalence. This principle affirmed that space-time is 
curved by matter, and that the ultimate parts of matter follow 
straight lines in this curved space-time so far as is possible. 

Now it was postulated at the outset in the new theory that 
the underlying “group” was general, so that there was no 
longer any kind of absolute underlying framework. In other 
words, proper independent variables were no longer available. 
This fact enormously increased the mathematical difficulties 
of the theory without yielding counterbalancing advantages 
in any other direction. For this reason, the generalized rela¬ 
tivity of Einstein has never entered effectively into physical 
speculation. Larmor probably expressed the general opinion 
among physicists when he called this theory “an auxiliary 
construct,” and said “the absence of space and time and mo¬ 
tion in the auxiliary construct is against reality.” 1 

One advantage of the alternative theory which I have mentioned
is that it preserves the basic electromagnetic framework.
Of course it is too early to say whether these theories
or some other gravitational theory lies in the direct line of
progress.

1 On Relativity and Convection, Mathematical and Physical Papers (1929), Vol. 1.

What was most effective, however, in preventing the new 
relativistic theories from being followed out thoroughly was 
the increasing realization that atomic phenomena seem to be 
inexplicable on any basis such as that provided by the back¬ 
ground of a four-dimensional space-time continuum. It was 
Planck who, in 1900, introduced the revolutionary idea of 
quanta of energy in order to meet some of the difficulties. 
The later attempt of Niels Bohr (1913) provided a quasi- 
dynamical model of the atom which served to explain in a 
very suggestive way many spectroscopic facts on the basis of 
quanta of energy. It is true that Bohr, in achieving his pur¬ 
pose, introduced ad hoc hypotheses, but the fundamental ad¬ 
vance was that he had provided a formalism by which predic¬ 
tions could be made. At the hands of Sommerfeld and other 
physicists who followed Bohr, many triumphs were scored by 
the theoretical physicists, and all turned to the study of 
atomic phenomena to which the new theories might be 
applied. 

In these quantum-mechanical theories, the hitherto basic 
concept of a four-dimensional continuum of space and time 
is completely abandoned without a qualm. Attention is 
generally fixed on a particular atom or molecule and its im¬ 
mediate neighborhood, and nothing else is considered. 

The culmination of this type of theory came in 1926 when 
Schrodinger formulated his celebrated “wave equation.” It 
is interesting to recall the general character of Schrodinger’s 
treatment. He starts out with the conventional classical 
model of an atom as composed of a positively charged proton 
having a given mass, and corresponding negatively charged 
electrons which are of much smaller mass. The equations of 
motion are then of usual Hamiltonian form. Using certain 
vague optical ideas, Schrodinger finds a way to associate a 
certain linear wave equation with these dynamical equations. 
Under appropriate “boundary conditions,” this Schrodinger 
wave equation has solutions possessing certain frequencies 
and these lead to the determination of the position of atomic
spectral lines, of their intensities, etc. Looked at in the cold
light of logic the processes involved seem to be a kind of 
mathematical abracadabra, but the result is extremely suc¬ 
cessful and suggestive just the same. The lack of cogent rea¬ 
soning throughout has often been justified by the assertion 
that light and electromagnetic phenomena are to be ex¬ 
plained alternatively on both a particle and a wave basis ! 

To give a correct impression of the significance of such 
wave-mechanical theories, it is necessary to make several 
comments. In the first place, it had always been supposed 
that light emitted from an atom was the result of electromag¬ 
netic vibration, and it was reasonable to expect that the equa¬ 
tion for such a vibration might be a linear partial differential 
equation, like the Schrodinger equation. But no conceptual 
method of obtaining his equation has been advanced except 
my own very incomplete attempt of 1926 1 when I introduced 
the concept of the perfect fluid. In this theory, I supposed 
that the proton and electrons could freely interpenetrate so 
that in the normal state of equilibrium there was complete 
neutralization of the electric charge. Under disturbances of 
the state of equilibrium, certain frequencies of vibration were 
obtained. This theory seems to me to be obviously in need of 
further development. Perhaps it does show, however, that an 
explanation based on a four-dimensional space-time con¬ 
tinuum of electromagnetic type is within the bounds of pos¬ 
sibility. 

1 A Theory of Matter and Electricity and The Hydrogen Atom and the Balmer Formula,
Proceedings of the National Academy of Sciences, Vol. 13 (1927).

To my mind, the Schrodinger equation is a successful attempt
to find underlying equations, without providing a suitable
conceptual setting. My notion can be illustrated by the
following situation which might have occurred in the development
of Newtonian gravitation. It will be recalled that
Kepler formulated three simple and well-known laws governing
the motion of two bodies, such as the Sun and the Earth.
Now suppose that someone had propounded the following
theory: In the planetary system no other laws operate than
these laws of Kepler between two members of the system.
From time to time the pair of bodies moving according to
these laws changes arbitrarily, just as Newton had conjectured
his light corpuscles might have “fits of easy transmission
and reflection.” Meanwhile, all of the other bodies, of
course, not involved in the binary Keplerian dance, are supposed
to move in a straight line at constant velocities, until
they join in the dance. It can then be shown that the motion
of the bodies in the planetary system must be essentially along
Newtonian lines. 1 Furthermore, there is evidently involved an
Indeterminacy Principle very analogous to the Indeterminacy
Principle introduced by Heisenberg and regarded as
basic in quantum mechanics. Or shall we rather say that the
change of partners in the binary dance is governed by an Exclusion
Principle like that of Pauli?

Yet who would declare that such an ad hoc theory of the 
planetary system, which would be nearly equivalent to the 
Newtonian theory as far as predictive power is concerned, is 
an adequate substitute for the grandiose conceptual theory of
Newton?

It seems to be inevitable that somehow or other a back¬ 
ground of space and time must eventually play a part in any 
truly satisfactory account of our physical universe. 

* * *

Having completed this hasty analysis of the development 
of basic physical theory since the time of Newton, we are in 
a position to summarize our general conclusions advan¬ 
tageously. 

In the first place, all of these theories, except that of quan¬ 
tum mechanics, employ a selected four-dimensional frame¬ 
work of space and time which is to serve as the stage for the 
interaction of matter and electricity. In all of these, the prin¬ 
ciple of the uniformity of nature is taken for granted. The 
laws which are postulated generally lead to one or more or¬ 
dinary or partial differential equations, so that finally the 
basis of all theory turns out to be a set of differential equa¬ 
tions—the differential equations of the physical universe! 

Whether or not quantum mechanics, in which space and
time only function as preliminary scaffolding, can be accommodated
within such a framework remains to be seen. However,
the illustration given above may serve to show how it is
possible to prefigure the formulas of a theory without providing
a suitable conceptual basis. Thereby it is suggested that
the fundamental Schrodinger equation may amount to a
genial formalistic conjecture, yet to be justified.

1 If there are n bodies in the system, the values of the masses would have to be thought
of as increased in a certain ratio n(n-1)/2.

It was also apparent that in each theory there was a basic 
underlying group, with corresponding invariants and mathe¬ 
matical language. In the classical case the group was that of 
Absolute Space and Time, augmented by uniform translatory 
motions; distance and interval were the two fundamental in¬ 
variants ; and the basic language was that of 3-vectors. Here 
the underlying model was dynamical, and the space and 
time were those of everyday intuition. Furthermore the ac¬ 
cepted laws were always chosen in accordance with the under¬ 
lying group and the allied Principle of Sufficient Reason. 
The dynamical formulation was extremely elegant in terms 
of suitable variational principles, such as the Principle of 
Least Action and Hamilton’s Principle. 

As electromagnetism began to be developed, the concept 
of the ether was introduced; this meant a limitation of the 
underlying group, since motion now appeared as absolute in 
the ether. The so-called electromagnetic equations could be 
given 3-vector form, but their formulation seemed entirely 
artificial, in marked contrast to the equations of the older 
Newtonian physics, for example, those embodied in the so- 
called Parallelogram Law for the composition of forces. 

But a little before the beginning of the present century, 
when the electromagnetic nature of the atom began to be 
appreciated, and when the Michelson-Morley experiments 
and others had indicated the complete indifference of matter 
to its passage through the ether, there began to occur a funda¬ 
mental change in outlook. This eventuated in the use of a 
new fundamental four-dimensional substratum of space- 
time, that of the special theory of relativity. The model here 
used was “astro-electric,” in that it was equally appropriate 
from the point of view of electromagnetic and of stellar phenomena.
The underlying group was the Lorentz group; the
single invariant was local time; and the appropriate language
was that of 4-vectors. 

The necessary reformulation of the fundamental properties 
of matter and electricity amounted to a modulation of the 
older theory in the direction suggested by the change in the 
basic mathematical language. Here there was the great ad¬ 
vantage that, on the basis of the Principle of Sufficient Rea¬ 
son, the electromagnetic equations appear natural against the 
new spatiotemporal background. But counterbalancing dis¬ 
advantages appeared, in particular the convenient concepts of 
the rigid and elastic body had to be abandoned. The theory of 
gravitation of Nordstrom (1912) was based on the new 
framework but was at once given up, since it did not predict 
the observed perihelial advance of Mercury. However it 
has been pointed out that a new simple theory (based on the 
“perfect fluid”) is now available in this type of space-time.

With this electromagnetic framework the idea of energy 
became more obscure, and variational principles seemed to 
retain only limited significance. Moreover no way was found 
of improving the older treatment of optical and electrical 
phenomena. 

Einstein’s theory of gravitation of 1916 also had its group 
(the general group), its one invariant (local time), and cor¬ 
responding language (that of tensors). It was based on the 
model of a space-time curved by matter, and afforded a bril¬ 
liant account of gravitational phenomena. Unfortunately 
the new spatiotemporal background is extremely complicated 
mathematically. There are no longer four true independent 
variables. Thus this interesting form of theory has seemed 
essentially unworkable. Here too variational principles oper¬ 
ate in incomplete form. 

In the final quantum-mechanical phase of today there is a
mystical employment of variational principles and optical
analogies of classical type in order to pass directly from the
approximate dynamical interpretation of the atom (proton
and electrons) to the corresponding wave equation of Schrodinger.
This is a powerful engine for making calculations,
and such calculations satisfy many theoretical physicists, for
they lead to new power and progress. Says Dirac in this connection, 1
“The only object of theoretical physics is to calculate
results that can be compared with experiment, and it is quite
unnecessary that any satisfying description of the whole
course of the phenomena should be given.”

But nothing, it seems to me, is more evident than that the 
method proposed for making these calculations has not yet 
been properly placed in the wider scheme of the universe as a 
whole. In fact I believe that there is a definite possibility of 
using electromagnetic space-time for this purpose. This pos¬ 
sibility should be studied most carefully. 

It is the Principle of Sufficient Reason and the allied theory
of groups that seem to me most significant for physical speculation.
I have ventured to formulate this aspect as follows. 2

PRINCIPLE OF SUFFICIENT REASON. If there appears in any theory T a set 
of ambiguously determined (that is, symmetrically entering) variables, then these 
variables can themselves be determined only to the extent allowed by the corresponding 
group G. Consequently any problem concerning these variables, which has a uniquely 
determined solution, must itself be formulated so as to be unchanged by the operations 
of the group G (that is, must involve the variables symmetrically). 

HEURISTIC CONJECTURE. The final form of any scientific theory T is: (1) based
on a few simple postulates; and (2) contains an extensive ambiguity, associated symmetry,
and underlying group G, in such wise that, if the language and laws of the theory
of groups be taken for granted, the whole theory T appears as nearly self-evident in virtue
of the above Principle.

Indeed it may be true in some mystical sense that “God
thinks multi-dimensionally, 3 whereas men can only think in
linear syllogistic series, and the Theory of Groups is the ap¬ 
propriate instrument of thought to remedy our deficiency in 
this respect.” Mathematics seems to be basically occupied 
with model making, and the basic characteristic of a good 
model is found to be its elegance in a group-theoretic sense. 

But it may be asked, What is the role of Variational Prin¬ 
ciples, such as the Principle of Least Action? Are they not 
equally important and basic? I believe that they are, in that 
the language of variational principles forms a kind of higher 
symbolism, above the calculus, so that, while the calculus 
is fundamental in dealing with physical theories, its processes 
may be replaced by more compact expression in variational 
terms. The true significance of variational principles in 
special cases remains extremely obscure. For example, an
arbitrary set of ordinary differential equations is readily
given variational form in a limited part of the domain of
the variables. It is only the fact that there is a single explicit
variational form available in the entire domain of the
independent variables that is really significant. Possibly this
interesting situation indicates that the basic importance of
variational principles will be found to be topological.

1 Quantum Mechanics (1930), p. 7.

2 Rice Institute, ibid., p. 45.

3 That is, uses multi-dimensional symbols beyond our grasp.

This leads me to remark that topology deserves to obtain 
a more prominent position in physical theories than it has 
yet obtained. Topology may be defined as the rudimentary 
form of geometry, which is not concerned with shapes or size, 
but rather with the connections of geometric entities of vari¬ 
ous dimensions. 

Thus in the mathematical patterns to be utilized in physics 
in the future, it may be conjectured that group theory together 
with the allied Principle of Sufficient Reason, Variational 
Principles, such as the Principle of Least Action, and To¬ 
pology, will play an important part. 

Physical theories have been, and always will be, mathe¬ 
matical theories; and the history of physics shows inversely 
that those mathematical theories which have not originated 
in a physical model but have been directly created by the hu¬ 
man mind, are likely to be exemplified later on in physical 
applications. 

It will probably be the new mathematical discoveries which 
are suggested through physics that will always be most im¬ 
portant, for, from the beginning, Nature has led the way and 
established the pattern which mathematics, the language of 
Nature, must follow. 





Reprinted from the Proceedings of the National Academy of Sciences,
Vol. 29, No. 8, pp. 231-239, August, 1943


MATTER, ELECTRICITY AND GRAVITATION IN FLAT SPACE-TIME

By George D. Birkhoff

Harvard University
Communicated July 13, 1943

In 1927 I presented two Notes* in which there was attempted a conceptual
approach to the then new Schrodinger wave equation. This was
done by taking matter to be a “perfect fluid,” defined against the background
of the curved space-time of Einstein's celebrated gravitational
theory of 1916.

In February of last year I had the honor of presenting at Puebla and 
Tonantzintla, Mexico, before the Astrophysical Congress convening there, 
a new gravitational theory based on the same perfect fluid in the much 
simpler flat space-time characteristic of modern electromagnetic theory 
and special relativity. Already in 1912 Nordstrom had proposed a gravi¬ 
tational theory founded upon this type of space-time; 1 but his theory 
failed in that it did not account for the observed slight advance of the peri¬ 
helion of Mercury beyond the amount indicated by the classical Newtonian 
theory. 

My theory was obtained by laying down those demands which seemed 
most natural and elegant from the mathematical point of view. 





In the present Note I wish to indicate in outline the modified account of 
matter, electricity and gravitation thus arrived at. The appropriate 
mathematical language is no longer that of tensors as in my two Notes of 
1927, but is that of 4-vectors. It should be emphasized that, from the 
mathematical and philosophical point of view, the new theory is very
simple.

I. Normal Coordinates in Flat Space-Time

Let $ds$ denote the element of local time so that

$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2,$$

where $dt$ and $dx$, $dy$, $dz$ refer to the usual time and space coordinates in
seconds and light-seconds, respectively. If now we replace $t$, $x$, $y$, $z$ by
$x_1 = t$, $x_2 = \sqrt{-1}\,x$, $x_3 = \sqrt{-1}\,y$, $x_4 = \sqrt{-1}\,z$, this formula takes the form

$$ds^2 = (dx_1)^2 + (dx_2)^2 + (dx_3)^2 + (dx_4)^2,$$

and the corresponding coordinates $x_i$ are called normal coordinates. In
such normal coordinates the language of 4-vectors becomes the same as
that of 3-vectors in ordinary space. For this reason it is possible to use
subscripts throughout rather than the subscripts and superscripts characteristic
of tensor theory. Thus we have

$$ds^2 = dx_\alpha^2. \tag{1}$$

For brevity we shall use such normal coordinates almost exclusively.
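
As a check on the substitution above, the following sketch writes out the arithmetic. The particle velocity components $v_x$, $v_y$, $v_z$ (in light-seconds per second) and the abbreviation $v^2$ are introduced here for illustration only and do not appear in the Note.

```latex
% Normal coordinates: x_1 = t, x_2 = \sqrt{-1}\,x, x_3 = \sqrt{-1}\,y, x_4 = \sqrt{-1}\,z
\begin{align*}
dx_\alpha^2 &= dt^2 + (\sqrt{-1})^2\,(dx^2 + dy^2 + dz^2) \\
            &= dt^2 - dx^2 - dy^2 - dz^2 = ds^2 .
\end{align*}
% Assumed illustration: a particle with velocity (v_x, v_y, v_z),
% v^2 = v_x^2 + v_y^2 + v_z^2 < 1
\begin{align*}
ds  &= \sqrt{1 - v^2}\;dt, \\
u_1 &= \frac{dx_1}{ds} = \frac{1}{\sqrt{1 - v^2}}, \qquad
u_2  = \frac{dx_2}{ds} = \frac{\sqrt{-1}\,v_x}{\sqrt{1 - v^2}}, \quad \text{etc.}, \\
u_\alpha^2 &= \frac{1 - v^2}{1 - v^2} = 1 .
\end{align*}
```

The last line is the identity $u_\alpha^2 = 1$ satisfied by the velocity vector $dx_i/ds$; a particle at rest has $u_1 = 1$, $u_2 = u_3 = u_4 = 0$.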


II. The Perfect Fluid 

By general consent the energy tensor of the homogeneous adiabatic
fluid is written in essentially the form

$$T_{ij} = \rho\,u_i u_j - p\,\delta_{ij}, \tag{2}$$

and, in terms of it, the equations of motion are written

$$\frac{\partial T_{i\alpha}}{\partial x_\alpha} = f_i. \tag{3}$$

Here $\rho$ and $p = f(\rho)$ designate, respectively, the density and pressure of the
fluid, $u_i$ is the velocity vector $dx_i/ds$, $\delta_{ij}$ is the usual Kronecker $\delta$, and $f_i$
stands for the body force vector per unit of volume.

The perfect fluid is singled out by the further requirement that the disturbance
velocity is to be that of light at all densities, with the corresponding
equation of state,

$$p = \rho/2. \tag{4}$$


Only with this type of fluid can essential mathematical difficulties be
avoided at collision of portions of the fluid.* It is also assumed that there
is an equilibrium density $\rho_0$ corresponding to a cosmic pressure $\rho_0/2$. The
precise value of this density nowhere enters, and it would be equally possible
to suppose that the equation of state has the form $p = (\rho - \rho_0)/2$
and that the pressure is 0 at the free boundaries in the customary manner.
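
The equation of state (4) can be related to the requirement on the disturbance velocity by a short computation. The sketch below assumes the usual relativistic relation between the disturbance velocity and the derivative of pressure with respect to rest energy density; the symbols $\varepsilon$ (rest energy density) and $c_s$ (disturbance velocity in units of the velocity of light) are introduced here and are not used in the Note.

```latex
% Rest energy density from the energy tensor (2), using u_\alpha^2 = 1:
\begin{align*}
\varepsilon &= T_{\alpha\beta}\,u_\alpha u_\beta
             = \rho\,(u_\alpha^2)(u_\beta^2) - p\,u_\alpha^2
             = \rho - p .
\end{align*}
% With the equation of state (4), p = \rho/2:
\begin{align*}
\varepsilon &= \rho - \tfrac{1}{2}\rho = \tfrac{1}{2}\rho = p, \\
c_s^2 &= \frac{dp}{d\varepsilon} = 1
        \quad \text{(assumed standard formula for the disturbance velocity)} .
\end{align*}
```

On this reading, (4) states that pressure equals rest energy density, so that the disturbance velocity is that of light at every density.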

We recall that the force vector $f_i$ is necessarily orthogonal to the velocity
vector $u_i$:

$$f_\alpha u_\alpha = 0. \tag{5}$$


III. The General Hypotheses on the Forces 

We propose now to make the following assumptions: (a) the force vector
$f_i$ is rational and integral in the velocity components, of not higher than the
second degree; and (b) the coefficients are homogeneous and linear in the
first partial derivatives of the corresponding potentials, namely, the atomic
potential $\psi$ which I introduced in 1927 (loc. cit.), the usual vector electromagnetic
potential $\varphi_i$ and the symmetric gravitational tensor potential
$h_{ij}$ defined in the present Note. We shall furthermore suppose that (c)
there are no degenerate quadratic terms in the velocities, i.e., no terms reducing
to terms independent of $u_i$ in view of the fundamental identity
$u_\alpha^2 = 1$. These three types of potential seem appropriately designated, inasmuch
as they refer primarily to matter, to electricity and to gravitation,
respectively.


IV. The Atomic Potential $\psi$


The components of the external forces acting on unit volume of the
perfect fluid due to the atomic potential will, according to our general hypothesis
above, be given by a vector of the form

$$c_{i\alpha}\,\frac{\partial\psi}{\partial x_\alpha}$$

of degree 0 in the velocities. Since $\partial\psi/\partial x_i$ is itself a vector it follows that
$c_{ij}$ must be a (numerical) tensor, and it is immediately obvious that it can
only be a multiple of $\delta_{ij}$. In fact if $u_i$ and $v_i$ are any two vectors, the associated
rational invariants are their squared lengths $u_\alpha^2$, $v_\alpha^2$ and the cosine of
the angle between them, $(u_\alpha v_\alpha)/[(u_\alpha^2)(v_\beta^2)]^{1/2}$. This fact shows that
$c_{\alpha\beta}u_\alpha v_\beta$ must be $c\,\delta_{\alpha\beta}u_\alpha v_\beta$. Hence $c_{ij} = c\,\delta_{ij}$ as was stated, and the force
vector under consideration must be $\partial\psi/\partial x_i$ up to a constant multiplier
which may be absorbed into $\psi$.

But the condition (5) upon this component of $f_i$ obviously yields

$$\frac{d\psi}{ds} = \frac{\partial\psi}{\partial x_\alpha}\,u_\alpha = 0, \tag{$A_1$}$$

i.e., the atomic potential remains constant along the world line of any particle
of the perfect fluid. It is assumed that $\psi$ vanishes along the free boundaries
and in empty space.

The formula for the corresponding atomic body force is, of course,

$$f_{Ai} = \frac{\partial\psi}{\partial x_i}. \tag{$A_2$}$$

The primordially given atomic potential $\psi$ supplies a useful mathematical
instrument in the construction of a conceptual theory of matter and electricity.
Thus in 1927 (loc. cit.) I showed how the atomic potential might
be used to obtain an atomic frequency equation which closely resembled the
celebrated wave equation of Schrodinger.

The model atom which I proposed was not positively unstable, although 
without rigidity. If it be required that the elementary constituents of 
matter, such as the proton, electron and neutron, cannot become locally 
concave this difficulty disappears (see my Oslo paper). We would have to
suppose then that, in the moment when such concavity tends to be pro¬ 
duced, there arises a tensional normal force at the surface just sufficient to 
prevent it. Under such a condition a closely packed set of elementary 
constituents would necessarily have polyhedral forms, and this property 
would seem to indicate a possibility of crystalline structure and rigidity. 
There would obviously be a tendency of such everywhere convex bodies to 
maintain a roughly spherical form under collisions and other strong dis¬ 
turbances. 


V. The Electromagnetic Potential $\varphi_i$

As indicated above, the electromagnetic force vector f Ei , which arises from 
the terms linear in the velocities, is to be a vector of the form 


C|o£X 


<>** 




Since it is our intention to identify the potential *>, with the usual electro¬ 
magnetic vector potential, we impose the condition 


d /<Vi _ <Va\ 

dx a \dx a dx,/ 


— 4ir<7K„ 


(£.) 


where o is the density of electricity. Now the only possible constituent 
terms in Sex are to be obtained from 


923 



Voi.. 29. 1943 


MATHEMATICS: C. P. hiRKHOFF 


23 r, 


dx. 




by a single contraction of indices and choice of a subscript i, in the three 
possible ways: 




u a . 


&<Pc 


u a , 




U t . 


dx a dx t dx t 

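Consequently the force vector in question is of the form

$$a\,\frac{\partial\varphi_i}{\partial x_\alpha}u_\alpha + b\,\frac{\partial\varphi_\alpha}{\partial x_i}u_\alpha + c\,\frac{\partial\varphi_\alpha}{\partial x_\alpha}u_i.$$

But by the general requirement (5) this has to vanish for $i = 1$ when $u_1 = 1$,
$u_2 = u_3 = u_4 = 0$, i.e.,

$$(a + b)\,\frac{\partial\varphi_1}{\partial x_1} + c\,\frac{\partial\varphi_\alpha}{\partial x_\alpha} = 0.$$

Since there is no necessary relation between $\partial\varphi_1/\partial x_1$ and
$\partial\varphi_\alpha/\partial x_\alpha$ we infer that $a + b = c = 0$. Hence this
electromagnetic force vector is essentially

$$f_{Ei} = \sigma\left(\frac{\partial\varphi_i}{\partial x_\alpha} - \frac{\partial\varphi_\alpha}{\partial x_i}\right)u_\alpha, \qquad (E_2)$$

since the ponderomotive force is proportional to the electrical density $\sigma$.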
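
The elimination argument of this section can be checked numerically. The sketch below is an illustration only: it reads requirement (5) as the orthogonality of force and velocity (the property Section VIII ascribes to a proper force vector), it assumes a metric of signature (+, -, -, -) for the contractions, and the names `em_force`, `power`, `DPHI` and `U` are hypothetical. It forms the three-term candidate force for arbitrary sample data and evaluates its contraction with the velocity.

```python
import math

# Assumed signature (+, -, -, -); the paper's own sign convention is not
# restated in this section, so this choice is an illustration only.
ETA = [1.0, -1.0, -1.0, -1.0]


def em_force(a, b, c, dphi, u_up):
    """Covariant components f_i of the three-term candidate force.

    dphi[i][j] stands for d(phi_i)/d(x_j); u_up holds the contravariant
    velocity components.
    """
    u_low = [ETA[i] * u_up[i] for i in range(4)]
    divergence = sum(ETA[k] * dphi[k][k] for k in range(4))
    force = []
    for i in range(4):
        term_a = sum(dphi[i][k] * u_up[k] for k in range(4))
        term_b = sum(dphi[k][i] * u_up[k] for k in range(4))
        force.append(a * term_a + b * term_b + c * divergence * u_low[i])
    return force


def power(a, b, c, dphi, u_up):
    """Contraction f_i u^i, which orthogonality requires to be zero."""
    return sum(f * u for f, u in zip(em_force(a, b, c, dphi, u_up), u_up))


# Hypothetical sample data: arbitrary gradients and a unit timelike velocity.
DPHI = [[float(4 * i + j + 1) for j in range(4)] for i in range(4)]
U = [math.cosh(0.5), math.sinh(0.5), 0.0, 0.0]

print(power(1.0, -1.0, 0.0, DPHI, U))  # a + b = c = 0: vanishes up to rounding
print(power(1.0, 1.0, 0.0, DPHI, U))   # a + b != 0: does not vanish
print(power(0.0, 0.0, 1.0, DPHI, U))   # c != 0: does not vanish
```

Only the antisymmetric combination survives the test, which is the content of the inference $a + b = c = 0$.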


VI. The Gravitational Potential $h_{ij}$

By analogy with the Poisson equation in classical gravitational theory,
we shall assume that a similar equation holds in each separate component
of the symmetric energy tensor $T_{ij}$ and the corresponding component $h_{ij}$ of
the symmetric gravitational potential $h_{ij}$, namely,

$$\Box h_{ij} = 8\pi T_{ij}, \qquad (G_1)$$

where the multiplier $8\pi$ is selected for reasons of convenience. This condition
$(G_1)$ is evidently the simplest analogous equation from the formal point
of view.

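According to our initial hypothesis the types of terms which may enter
in the corresponding gravitational force vector $f_{Gi}$, arising from the
quadratic terms in the velocities, are derived from

$$\frac{\partial h_{\alpha\beta}}{\partial x_\gamma}\,u_\delta u_\epsilon$$

by a double contraction of indices and choice of the index $i$. This yields
six possible types:

$$\frac{\partial h_{i\alpha}}{\partial x_\beta}u_\alpha u_\beta, \quad \frac{\partial h_{\alpha\beta}}{\partial x_i}u_\alpha u_\beta, \quad \frac{\partial h_{\alpha\beta}}{\partial x_\alpha}u_\beta u_i, \quad \frac{\partial h_{\alpha\alpha}}{\partial x_\beta}u_\beta u_i, \quad \frac{\partial h_{i\alpha}}{\partial x_\alpha}u_\beta^2, \quad \frac{\partial h_{\alpha\alpha}}{\partial x_i}u_\beta^2.$$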

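But the last two types reduce respectively to $\partial h_{i\alpha}/\partial x_\alpha$ and
$\partial h_{\alpha\alpha}/\partial x_i$ since $u_\beta^2 = 1$ and so are of the degenerate type
excluded by the hypothesis (c). Thus the most general available gravitational
force vector is of the form

$$a\,\frac{\partial h_{i\alpha}}{\partial x_\beta}u_\alpha u_\beta + b\,\frac{\partial h_{\alpha\beta}}{\partial x_i}u_\alpha u_\beta + c\,\frac{\partial h_{\alpha\beta}}{\partial x_\alpha}u_\beta u_i + d\,\frac{\partial h_{\alpha\alpha}}{\partial x_\beta}u_\beta u_i.$$

But by (5) this must vanish for $i = 1$ if $u_1 = 1$, $u_2 = u_3 = u_4 = 0$, so that

$$(a + b)\,\frac{\partial h_{11}}{\partial x_1} + c\,\frac{\partial h_{\alpha 1}}{\partial x_\alpha} + d\,\frac{\partial h_{\alpha\alpha}}{\partial x_1} = 0.$$

Since the quantities $\partial h_{11}/\partial x_1$, $\partial h_{\alpha 1}/\partial x_\alpha$ and
$\partial h_{\alpha\alpha}/\partial x_1$ are independent of one another we must have
$a + b = c = d = 0$. This yields for the gravitational force vector essentially
the following expression

$$f_{Gi} = \rho\left(\frac{\partial h_{i\alpha}}{\partial x_\beta} - \frac{\partial h_{\alpha\beta}}{\partial x_i}\right)u_\alpha u_\beta, \qquad (G_2)$$

where the factor $\rho$ is introduced since the gravitational force is proportional
to the density $\rho$.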
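
The same kind of numerical check applies to the quadratic case. The sketch below assumes, as an illustration, a signature (+, -, -, -), an arbitrary symmetric sample array `DH` standing for the derivatives of the potential, and a four-term candidate force with coefficients a, b, c, d attached to the four non-degenerate contraction types; `grav_power` and the sample data are hypothetical names, not part of the paper.

```python
import math

ETA = [1.0, -1.0, -1.0, -1.0]  # assumed signature (+, -, -, -)
R4 = range(4)


def grav_power(a, b, c, d, dh, u_up):
    """Contraction f_i u^i for the four-term candidate gravitational force.

    dh[i][j][k] stands for d(h_ij)/d(x_k) and is symmetric in i and j.
    """
    u_low = [ETA[i] * u_up[i] for i in R4]
    # Scalars that multiply u_i in the c and d terms.
    c_scalar = sum(ETA[p] * dh[p][q][p] * u_up[q] for p in R4 for q in R4)
    d_scalar = sum(ETA[p] * dh[p][p][q] * u_up[q] for p in R4 for q in R4)
    total = 0.0
    for i in R4:
        term_a = sum(dh[i][p][q] * u_up[p] * u_up[q] for p in R4 for q in R4)
        term_b = sum(dh[p][q][i] * u_up[p] * u_up[q] for p in R4 for q in R4)
        f_i = a * term_a + b * term_b + (c * c_scalar + d * d_scalar) * u_low[i]
        total += f_i * u_up[i]
    return total


# Hypothetical sample data, symmetric in the first two indices.
DH = [[[float((i + j + 1) * (k + 2) + i * j) for k in R4] for j in R4] for i in R4]
U = [math.cosh(0.5), math.sinh(0.5), 0.0, 0.0]

print(grav_power(1.0, -1.0, 0.0, 0.0, DH, U))  # a + b = c = d = 0: vanishes
print(grav_power(1.0, 1.0, 0.0, 0.0, DH, U))   # a + b != 0
print(grav_power(0.0, 0.0, 1.0, 0.0, DH, U))   # c != 0
print(grav_power(0.0, 0.0, 0.0, 1.0, DH, U))   # d != 0
```

Under these assumptions only the combination with $a + b = c = d = 0$ is orthogonal to the velocity for arbitrary data.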


VII. The Complete Theory 

We have then an energy tensor $T_{ij}$ given by (2) with $p = \rho/2$. The
equations of motion are given by (3) where

$$f_i = f_{Ai} + f_{Ei} + f_{Gi}.$$

The three terms on the right are defined by $(A_2)$, $(E_2)$, $(G_2)$, and are of degrees
0, 1, 2 in the velocities. Furthermore the equations $(A_1)$, $(E_1)$, $(G_1)$ determine,
respectively, the atomic, electromagnetic and gravitational potentials
involved. Thus we have 21 dependent variables, $\rho$, $\sigma$, $u_i$, $\psi$, $\varphi_i$, $h_{ij}$,
and we have 20 equations in these variables, namely, the 4 equations of
motion (3), the single equation $(A_1)$, the 4 equations $(E_1)$, the
10 equations $(G_1)$ and in addition the relation $u_\alpha^2 = 1$. This is as it
should be since the electromagnetic potential $\varphi_i$ is only determined up to the
gradient of an arbitrary function, $\partial V/\partial x_i$. It is to be remembered that in
empty space we take the atomic potential $\psi$ and the energy tensor $T_{ij}$ to
vanish.




This system of equations is complete and consistent as a mathematical 
embodiment of matter, electricity and gravitation. The corresponding 
system in the generalized theory of Einstein is incomplete to the extent 
that the equation of state of the homogeneous adiabatic fluid is not speci¬ 
fied, and it is inconsistent in that when two portions of the fluid collide, the 
equation of motion may break down. 

VIII. Résumé of the New Gravitational Theory

Let us now disengage as far as possible the new gravitational theory and 
consider to what extent its predictions are in agreement with the known 
facts. 

Matter is supposed to be either that special, mathematically satisfactory, 
homogeneous adiabatic fluid for which $p = \rho/2$ or, presumably, any form
of matter in which the disturbance velocity is that of light under all circum¬ 
stances. 

We take $T_{ij}$ to designate the energy tensor of matter and suppose that
the equations of motion in the absence of a gravitational field (i.e., when
only a small quantity of matter is present) may be written in the usual
form:

$$\frac{\partial T_{i\alpha}}{\partial x_\alpha} = f_i,$$

where $f_i$ is a suitable force vector. It is then assumed that in the case of
a gravitational field we may write the force vector in the form

$$f_i + f_{Gi},$$

where $f_{Gi}$ designates the gravitational force vector. 4

We assume further that there is an associated symmetric gravitational
tensor potential $h_{ij}$ such that a Poisson equation $(G_1)$ holds in each
component of $h_{ij}$ relative to $T_{ij}$. In empty space $T_{ij}$ is taken to vanish.

Under these circumstances $f_{Gi}$ cannot be independent of the velocities
since a proper force vector must be orthogonal to the velocity vector. 
Moreover since gravitational theory is reversible in time (in contradistinc¬ 
tion to electromagnetic theory), and since in the classical theory the gravi¬ 
tational force components are given by the components of the gradient of 
the gravitational potential, we assume that the gravitational force vector is 
homogeneous and quadratic in the velocity components, and homogeneous 
and linear in the first derivatives of the components of the gravitational 
potential, and furthermore that none of its components are degenerate, i.e., 
involve a factor $u_\alpha^2 = 1$.

In this way we obtain for the gravitational force vector $f_{Gi}$ the unique
expression $(G_2)$. Thus our complete system of equations is

$$\frac{\partial T_{i\alpha}}{\partial x_\alpha} = f_i + f_{Gi}, \qquad \Box h_{ij} = 8\pi T_{ij},$$

where we regard $f_{Gi}$ and $T_{ij}$ as replaced by their explicit expressions.

It is our intention in conclusion to indicate why this simple theory of
gravitation set in the framework of flat (i.e., classical electromagnetic)
space-time is in agreement with the observed facts.

IX. Gravitation in the Quasi-Stationary State

Let us suppose first that the portions of the perfect fluid are moving at
small velocities relative to some frame of reference so that we have
approximately $u_1 = 1$, $u_2 = u_3 = u_4 = 0$. In this case we find in the coordinates
$x_1 = t$, $x_2 = x$, $x_3 = y$, $x_4 = z$,

$$\Delta h_{ij} = -4\pi\rho\,\delta_{ij} \qquad \left(\Delta = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)$$

approximately so that $h_{ij}$ is negligible for $i \neq j$ while $h_{ii}$ for $i = 1, 2, 3, 4$
reduce to the ordinary gravitational potential $g$. The gravitational force
vector per unit of mass then reduces to the gradient of $g$. Thus the theory
is in first order agreement with the Newtonian theory.


X. The Centrally Symmetric State 

Suppose next that we have a sphere of the perfect fluid at rest with its
center at $(x, y, z) = (0, 0, 0)$. Of course $T_{ij}$ and $h_{ij}$ will then be independent
of the time $t$, and our Poisson tensor equation reduces to the equation
written above in the same coordinates so that we obtain for all $i$ and $j$ the exact
equation outside of the sphere

$$h_{ij} = \frac{m}{r}\,\delta_{ij} \qquad (r, \text{radial distance})$$

where $m$ is the mass of the fluid sphere.

Thus the three exact equations of motion for a particle at $(x, y, z)$
attracted by the sphere are found to be of the type

$$x'' = -\frac{mx}{r^3} - \frac{2mx}{r^3}\left(x'^2 + y'^2 + z'^2\right) + \frac{m x' r'}{r^2},$$

where the accent ′ indicates differentiation as to $s$. It is to be observed
that the first term on the right yields the dominant Newtonian force
components, the other two small terms being relativistic in origin.

Now there is no essential restriction in assuming that the initial plane of
the motion through $(0, 0, 0)$ is the plane $z = 0$, whence we can at once conclude
that $z = 0$ for all time. Hence we have only to solve the first two equations
with $z = z' = 0$, $r = \sqrt{x^2 + y^2}$. This is a readily integrable pair of
equations with $x'$, $y'$ and $y$, $-x$ as two pairs of integrating factors.

The differential equation of the path of the particle is seen to be

$$h^2\left(\left(\frac{du}{d\theta}\right)^2 + u^2\right) = e^{2mu}\left((1 + C)\,e^{2mu} - 1\right),$$

where $u = 1/r$ and $\theta$ is the longitude; $h$ and $C$ are arbitrary constants of
integration. This may be integrated by an obvious quadrature.

It is thus readily established that the resultant formulas for the advance 
of perihelion of the particle (x, y, z) and the deviation of a ray of light 
(thought of as the path of a photon) in the gravitational field of the central 
sphere of matter are the same in their principal parts as in the Einstein 
theory. Furthermore the formula for the spectral shift toward the red is 
also in essential agreement with that theory. The exact expressions are, 
however, different. 5 
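
The statement about the perihelion can be illustrated numerically. The sketch below rests on a loudly stated assumption: the path equation is taken in the form $h^2((du/d\theta)^2 + u^2) = (1 + C)e^{4mu} - e^{2mu}$, which is a reconstruction from the damaged printed formula, and the sample values of $m$, $h$ and the perihelion value `U0` are hypothetical. Differentiating with respect to $\theta$ gives a second-order equation, which is integrated by a Runge-Kutta scheme from one perihelion to the next. The result is compared with the familiar leading-order advance per revolution, $6\pi m^2/h^2$, of the Einstein theory.

```python
import math


def perihelion_advance(m, h, u0, step=1.0e-3, max_angle=50.0):
    """Angle between successive perihelia, minus 2*pi (RK4 integration).

    Integrates u'' + u = (m / h**2) * (2 k_sq e^{4mu} - e^{2mu}), where
    k_sq = 1 + C is fixed by starting at a perihelion (u = u0, u' = 0).
    """
    k_sq = (h * h * u0 * u0 + math.exp(2.0 * m * u0)) * math.exp(-4.0 * m * u0)

    def rhs(u, w):
        source = 2.0 * k_sq * math.exp(4.0 * m * u) - math.exp(2.0 * m * u)
        return w, (m / (h * h)) * source - u

    theta, u, w = 0.0, u0, 0.0
    while theta < max_angle:
        a_u, a_w = rhs(u, w)
        b_u, b_w = rhs(u + 0.5 * step * a_u, w + 0.5 * step * a_w)
        c_u, c_w = rhs(u + 0.5 * step * b_u, w + 0.5 * step * b_w)
        d_u, d_w = rhs(u + step * c_u, w + step * c_w)
        u_next = u + step * (a_u + 2.0 * b_u + 2.0 * c_u + d_u) / 6.0
        w_next = w + step * (a_w + 2.0 * b_w + 2.0 * c_w + d_w) / 6.0
        # The next perihelion is where u' passes from positive to non-positive.
        if w > 0.0 and w_next <= 0.0:
            return theta + step * w / (w - w_next) - 2.0 * math.pi
        theta, u, w = theta + step, u_next, w_next
    raise ValueError("no second perihelion found; the orbit may be unbound")


M, H, U0 = 1.0, 20.0, 0.003  # hypothetical units with c = G = 1
ADVANCE = perihelion_advance(M, H, U0)
LEADING = 6.0 * math.pi * M * M / (H * H)
print(ADVANCE, LEADING)
```

For weak fields the two printed numbers should agree closely, with a small residual difference reflecting the remark that the exact expressions of the two theories differ.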

Thus the simple theory of gravitation here outlined seems well adapted to 
explain the known physical facts. Like the Einstein theory, it has the ad¬ 
vantage of involving no arbitrary constants whatsoever. However, it is 
essentially different in that it presupposes a framework of flat space-time 
instead of space-time curved by matter, and a basic form of matter in which 
the disturbance velocity is that of light. 

1 “A Theory of Matter and Electricity,” “The Hydrogen Atom and the Balmer
Formula,” these Proceedings, vol. 13, 1927. See also my article, “The Foundations of
Quantum Mechanics,” in the Proceedings of the International Mathematical Congress at
Oslo, 1 (1936).

2 “Relativitätsprinzip und Gravitation,” Physikalische Zeitschrift, 13 (1912).

3 In a paper about to appear in the Revista de Ciencias de Lima, entitled “Sobre el
Fluido Perfecto,” I have shown by direct integration that at least in two-dimensional
space-time, such difficulties do not arise with the perfect fluid.

4 It is a mistake to believe that the Einstein theory of gravitation does not similarly
superimpose a gravitational force upon the other forces. It is only the mechanism of
the superimposition which is different in the Einstein theory. In fact any physical
theory without gravitation in flat space-time becomes one with gravitation in curved
space-time when ordinary derivatives are replaced systematically by the covariant
derivatives of the tensor calculus.

5 The details of the new gravitational theory will appear in my article “El concepto
de tiempo y la gravitacion” in the Proceedings of the Astrophysical Congress held at
Puebla and Tonantzintla, Mexico, in February, 1942.





Reprinted from THE SCIENTIFIC MONTHLY, Part I, January, 1944, Vol. LVIII, pages 49-57.
A contribution by Dr. George D. Birkhoff, Perkins Professor of Mathematics, Harvard
University, Cambridge, Massachusetts.

NEWTONIAN AND OTHER FORMS OF 
GRAVITATIONAL THEORY* 

I. NEWTONIAN THEORY 

By GEORGE D. BIRKHOFF


It has been generally granted that Sir Isaac Newton’s work on universal
gravitation constitutes an unsurpassed scientific achievement. To begin with,
in order to develop his ideas concerning the nature of gravitation, Newton
devised that necessary mathematical instrument of modern scientific thought,
his “fluxions,” now termed the calculus; then, with the aid of this powerful
new tool, he developed the principal theoretic consequences of his inverse
square law; furthermore, he applied his theory to the most varied types of
gravitational phenomena with extraordinary skill and penetration; and,
finally, after making certain that supposed observational discrepancies
were disposed of, and after transferring his analytical equations into the
language of infinitesimal geometry current in his day, he gave out his great
master work, the Principia of 1687. Truly this four-fold accomplishment was
most remarkable; and the unfoldment of the program for the physical sciences
which Newton thus initiated has continued to dominate the field of physics
during nearly all of the two and a half centuries which have since elapsed:

Nature and Nature’s laws lay hid in night.

God said “Let Newton be,” and all was light.

In this tercentenary year (1942) of Newton’s birth it is therefore natural
and appropriate to estimate his theory of gravitation in its relation to the
new forms of gravitational theory which have developed in the last four
decades. This I shall endeavor to do in what follows, avoiding the use of
technical mathematical terms except in a purely descriptive way. Likewise
when simple equations are written down in a few cases, it is only to show
the obviously close formal analogy between the Newtonian theory and its
modern counterparts.

* From the symposium on “Natural Philosophy” commemorating the 300th
anniversary of Newton’s birth which was to have been presented at the New
York meeting of the American Association for the Advancement of Science.

It should be remarked first of all that not until after the celebrated
Michelson-Morley experiments of 1887 did it begin to be realized that matter
was astonishingly indifferent to motion through the ether. Between Newton’s
discovery and these experiments there had elapsed more than two hundred
years. In this period physicists and astronomers did little more than
consider the following questions concerning gravitation: Is it not possible
that gravitation travels with a finite velocity instead of with infinite
velocity as demanded by Newton’s theory? Is it not possible that an exponent
slightly different from the exponent 2, appearing in his law, might improve
the theory? Are or are not “inertial” mass and “gravitational” mass exactly
the same, as was postulated by Newton? The careful study of these three
questions had only served to confirm the Newtonian theory.

Aside from this specific work, one fact, 
however, had emerged: A very close exami¬ 
nation of the phenomena of the solar system 
by theoretical astronomers, on the basis of 
the Newtonian law, had shown that there was
a slight but definite excess of perihelial ad¬ 
vance of the planet Mercury amounting to 
41" per century. Since the velocity of Mer¬ 
cury is large as compared with the velocities 
of the other planets, it was natural to conjec¬ 
ture that here was beginning to appear a 
deviation from the Newtonian law which 
should become more and more marked as the 
velocity approached that of light, which is 
the fundamental limiting velocity in an elec¬ 
tromagnetic universe. 

The analysis here undertaken will begin 
with some simple mathematical observations 
and then pass on to give an outline of the 
Newtonian theory (part I) and of recent
relativistic theories of gravitation (part II).
My approach will in no sense be historical. 1 
Instead, it will present a comparative analy¬ 
sis of the various forms of gravitational the¬ 
ory. The Newtonian theory of gravitation 
may be likened to the opening movement of 
a great scientific symphony, expressed in the 
natural language of everyday geometrical 
and temporal intuition, namely that of 3-vec¬ 
tors. There follows an analogous second 
movement formulated in the terminology 
of 4-vectors, appropriate to 4-dimensional 
“astro-electric” space-time. In this, Nordström's
gravitational theory of 1912 appears
as the natural development, although ending 
in an unresolved discord because certain 
slight gravitational phenomena, like that ob¬ 
served in the motion of Mercury, are not 
accounted for. Then there follows immedi¬ 
ately the very brilliant, chaotic, and incom¬ 
plete third movement formed by the gravi¬ 
tational theory of Einstein of 1916. Here 
the calculus of tensors, appropriate to space- 
time curved by matter, is employed through¬ 
out. In this way the discord in the second 
movement is resolved and the predictive 
power of the new theory is established. 

Now it has been generally agreed that (to 
carry our musical comparison a step further) 
there is need for an appropriate coda. Weyl, 
Kaluza and others have shown that there 
exist interesting possibilities in the way of 
providing a unified account of electromag¬ 
netic and gravitational phenomena. To 
these I shall refer only briefly. But I shall 
characterize more in detail a recent attempt 
of a fundamentally different type, which I 
presented at the Astrophysical Congress 
which met at Puebla and Tonantzintla, Mexico,
in February, 1942.² My theory, like that
of Nordström, is based on the simpler
Lorentz-Minkowski space-time of electromagnetism.

It is entirely too early to say as yet how any of these various theories
will be regarded in the future. One may well doubt whether any known
physical theories will have final validity for the comprehension of Nature.
We can be quite certain, however, that the classical Newtonian theory of
gravitation will always remain serviceable to the theoretical astronomer.
Indeed, from this point of view, a more complicated gravitational theory,
like that based on a curved space-time, can only operate as an “auxiliary
construct” (to use Larmor’s characterization), taking into account a few
minute relativistic gravitational effects almost beyond the reach of
observation.

1 For an extremely interesting historical and scientific evaluation of
Newton see E. T. Bell, “Newton After Three Centuries,” American Mathematical
Monthly, November, 1942.

2 To be published in the Proceedings of the Congress.

Of course, it is impossible to understand 
and appreciate adequately the magnificent 
work of Newton without proper historical 
perspective. It is easy but misleading to 
magnify his position at the expense of his 
contemporaries and predecessors. Perhaps 
the half dozen figures of greatest import in 
the Newtonian background are those of 
Archimedes, Galileo, Kepler, and Descartes 
among his predecessors, and, among contem¬ 
poraries, his illustrious friend, the “Summus 
Hugenius,” and his yet greater rival, the 
mathematician and philosopher Leibniz. 
Furthermore it must be admitted that both 
the calculus and the classic theory of gravi¬ 
tation lay quite near at hand when Newton 
found them. This is evidenced by the fact 
that Leibniz was a co-inventor of the calcu¬ 
lus, and by the prevalent discussion of a pos¬ 
sible force of attraction inversely propor¬ 
tional to the square of the distance. 

Our point of view will be mathematical 
rather than physical, in the following sense. 
The physicist as such disclaims interest in 
any theory which is not in accord with 
Nature, even though it has achieved a con¬ 
siderable degree of success. At each moment 
he is hoping to discover the ultimate, all- 
embracing theory, after which he expects to 
abandon forthwith the earlier partial ex¬ 
planations. On the other hand, the mathe¬ 
matician studies freely all forms of theory 
which possess a certain esthetic-mathematical 
quality. Thus he is not only interested in 
ordinary real numbers, but invents imagi¬ 
nary numbers and a host of other types of 
number which interest and fascinate him as 
objects of abstract thought. Often he finds 
later on that these generalizations and modi¬ 
fications turn out to be essential for the 
understanding of natural law. Thus, to take a recent striking example, in
quantum mechanics it has been necessary to deal with “matrices,” which
constitute a kind of generalized number previously studied in detail by
mathematicians for their own sake. Likewise mathematicians have studied many
forms of geometry, and in particular the general n-dimensional Riemannian
geometry of curved spaces; and only by use of this theory and the associated
mathematical theory of tensors of Ricci and Levi-Civita was it possible to
develop the consequences of the “equivalence hypothesis” which lies at the
basis of the generalized theory of gravitation of Einstein.

In what follows, then, the esthetic-mathe¬ 
matical point of view will be taken through¬ 
out. Moreover, we shall not venture to 
conjecture what the ultimate account of 
gravitation is going to be, but rather we shall 
try to coordinate and appreciate the formal 
structures of various gravitational theories, 
as well as to assess their serviceability in 
physics. 

The Use of Models in Mathematics and 
Physics. From the beginning of his scientific
thinking, man has progressed by means of 
conceptual models taken from Nature. In 
fact, from a certain point of view, mathe¬ 
matics itself, crudely defined as the study of 
number and form, takes its origin in this 
characteristic way. For, the simplest type 
of physical universe is one thought of as 
made up of classes of distinguishable objects, 
considered without regard to their specific
properties but only as comparable with one 
another by the process of matching or one-to- 
one correspondence, as when the fingers of 
one hand are matched with those of the other. 
Through experience with this simple universe 
of classes, the concepts of logic and number 
arise irresistibly. Likewise the type of ideal¬ 
ized universe in which there are rigid bodies 
comparable with one another by direct super¬ 
position leads us inevitably to the concepts 
of Euclidean geometry. 

In both of these simple conceptual models 
so fundamental for mathematics, the comple¬ 
mentary processes of Analysis and Synthesis, 
which Newton insisted were necessary for 
mathematics and physics alike, are obviously 
present. The intimate intermixture of these
two types of processes in his daily experience
with number and form has led man to assign
to these ideas an absolute and eternal valid¬ 
ity not readily granted to other conceptual 
ideas. Although we no longer think of 
Euclidean geometry as incorporated exactly 
in real space, nevertheless Logic, Number, 
and Geometry are firmly established as basic 
theoretical constructs of the human mind. 

The point of view will be taken in the present
paper that conceptual models are likely
to continue to play a fundamental role in 
the development of theoretical physics, de¬ 
spite the apparent abandonment of such 
models in recent quantum-mechanical ad¬ 
vances. All of the gravitational theories to 
be considered here rest upon the basic model 
of an underlying space-time continuum of 
four dimensions, whose "points” correspond 
to events. 

The Role of Postulates. It was because the 
Newtonian Law of Gravitation was so deeply 
consonant with the simple intuitive ideas of 
space and time, as well as with all the known 
physical facts of his day, that Newton was
able to affirm “Et hypotheses non fingo”: I
do not frame hypotheses! Leibniz had a
similar feeling. Said Leibniz, “Far from
approving the acceptance of doubtful prin¬ 
ciples, I would have people seek even the 
demonstrations of the axioms of Euclid.” 

From the more sophisticated logical point 
of view of the present day, it would be considered,
however, that all physical and mathematical
theories are necessarily built upon
certain hypotheses, or "postulates,” which 
need to be carefully stated. In Newton's 
Principia his celebrated gravitational hy¬ 
pothesis is introduced through an analysis 
of known physical facts, in particular of the 
consequences of Kepler's planetary laws of 
elliptic motion. We may synopsize the set 
of postulates which Newton tacitly employed 
by saying that he accepted the workaday 
conceptual ideas of Absolute (Euclidean) 
Space and Absolute Time, although he saw 
no reason to distinguish between the Absolute 
Space and any other space in uniform translatory
motion with respect to it, nor to fix
upon theoretic units of space or time. 

This means for us today that all of Newton's
concepts were such as to be best expressed
in the mathematical language of 3-vectors
(that is, vectors with three components).
For example, a velocity u is a
directed quantity of this vectorial type, since 
to determine it we need only specify a 
directed line or vector, showing the direction 
and magnitude of the velocity in ordinary 
three-dimensional space. 

In what follows we shall see how the Nordström
theory referred to above is highly
analogous to the Newtonian theory, except 
that the language has been changed to that 
of 4-vectors. Furthermore, we shall find 
likewise that the generalized theory of Ein¬ 
stein is highly similar in structure. Here 
the appropriate language is that of tensors. 

Groups, Invariants, and Mathematical
Language. For the deeper understanding
of all these theories, it is necessary to say
something about the mathematical concept 
of a “group of operations.” If any two or 
more of the operations can be combined into 
a single resultant operation of the same type, 
and if there always exists an operation of the 
set which undoes what another operation 
does, then the collection of operations is said 
to form a “group.” It is furthermore found 
to be convenient to regard the operation 
which does nothing, called the “identity” 
operation, as an element I of the group. 
Thus the postulates for such a group are 
essentially the following in concise form: 

I. (AB)C = A(BC) for any A, B, C,

II. AI = IA = A for any A,

III. AX = B has a unique solution X for any A, B.

The table showing the result of combining 
any two of these operations constitutes the 
so-called “multiplication table” of the group. 

For example, if we consider a square lying 
on a table and the operations of rotating the 
square into itself upon the table, we find that 
there are four operations: the identity I; a 
rotation through 90°, A; a rotation through
180°, B; and a rotation through 270°, C.
The corresponding multiplication table is obviously the following:

        I   A   B   C
    I   I   A   B   C
    A   A   B   C   I
    B   B   C   I   A
    C   C   I   A   B




Another very elementary example is fur¬ 
nished by the operations of adding an integer 
(to a number). Here the identity operation 
I is that of adding zero. 
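
The example of the square can be reproduced in a few lines. The sketch below is an illustration under stated assumptions: each rotation is represented by its angle in degrees, combination is addition of angles modulo 360, and the names `combine`, `TABLE` and `satisfies_postulates` are hypothetical. It builds the multiplication table and checks postulates I to III by exhaustive enumeration.

```python
from itertools import product

# Rotations of the square by 0, 90, 180, 270 degrees, labelled as in the text.
ANGLES = {"I": 0, "A": 90, "B": 180, "C": 270}
NAMES = {angle: name for name, angle in ANGLES.items()}


def combine(first, second):
    """Resultant of performing `first` and then `second`."""
    return NAMES[(ANGLES[first] + ANGLES[second]) % 360]


TABLE = {x: {y: combine(x, y) for y in ANGLES} for x in ANGLES}


def satisfies_postulates(elements, op, identity):
    """Check associativity, the identity law, and unique solvability of AX = B."""
    assoc = all(op(op(a, b), c) == op(a, op(b, c))
                for a, b, c in product(elements, repeat=3))
    ident = all(op(a, identity) == a == op(identity, a) for a in elements)
    unique = all(sum(op(a, x) == b for x in elements) == 1
                 for a, b in product(elements, repeat=2))
    return assoc and ident and unique


for row in ANGLES:
    print(row, " ".join(TABLE[row][col] for col in ANGLES))
print(satisfies_postulates(list(ANGLES), combine, "I"))
```

The same checker would apply to any finite set of operations supplied with its own `op`; the infinite additive group of integers cannot be enumerated in this way.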

Now in the geometric background of Eu¬ 
clidean space, the group of rigid motions has 
always entered intuitively. It will be re¬ 
called that Euclid accepted the proof of 
geometric theorems by the method of direct 
superposition. This really meant that he 
was accepting the group of motions as valid 
in elementary geometry. The natural analytic
language in which such geometric ideas
are appropriately couched is that of 3-vectors 
already referred to above. 

In this mathematical symbolism certain 
simple processes turn out to be fundamental. 
The simplest of these is the multiplication 
of a vector by a constant which changes the
magnitude of the vector by this factor without
modifying its direction. Another simple
process is that of vector addition portrayed
in the following figure.

[Figure: addition of two vectors]

Still another important idea is that of the “scalar product” of
two vectors; if these vectors are f and g, this 
would be indicated by f • g. A “scalar” 
quantity is one, like that of mass, which is 
essentially a magnitude without direction. 
The simplest types of quantities in vector 
theory are vectors and scalars. More com¬ 
plicated but fundamental types are those 
termed bivectors, dyadics, triadics, etc., by 
Gibbs. The simplest vector operations, in¬ 
volving the calculus, are indicated by div 
(read divergence), grad (read gradient) and 
curl. We shall not endeavor here to define 
these basic operations. 

Two extremely important related concepts 
in the theory of groups are those of invari¬ 
able properties and of invariants. In the 
illustration above of the group of rotations 
taking a square into itself, an invariable 
property is that of the adjacency or opposite¬ 
ness of sides and vertices: evidently adja¬ 
cency and oppositeness are qualities not 
affected by performing any operation of the 
group. Similarly, in the case of the additive
group of integers, the difference of two numbers,
x − y, expresses an invariant in the technical
sense, because this difference is not
affected by adding the same integer to both
x and y.

In the case of the group of motions of 
Euclidean geometry, the most fundamental
invariant is the distance between two points,
which is not affected by any motion whatsoever.
Riemann showed how this concept
of distance appropriately extended lay at 
the foundation of n-dimensional geometry, 
whether flat or curved. A somewhat less fun¬ 
damental concept, but still having great im¬ 
portance, is that of the angle between two 
lines or vectors. It may be shown that in 
general the invariants of any group deter¬ 
mine the group completely. 
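
Both invariants mentioned above are easy to exhibit. In the sketch below the rigid motion is taken, purely for illustration, to be a rotation about the z-axis followed by a translation; the sample points, the angle and the function names are hypothetical.

```python
import math


def rigid_motion(point, angle, shift):
    """Rotate a point about the z-axis by `angle`, then translate by `shift`."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    rotated = (c * x - s * y, s * x + c * y, z)
    return tuple(r + t for r, t in zip(rotated, shift))


def distance(p, q):
    """Euclidean distance between two points of three-dimensional space."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


P, Q = (1.0, 2.0, 3.0), (-2.0, 0.5, 4.0)
SHIFT = (5.0, -1.0, 2.0)
P2, Q2 = rigid_motion(P, 0.7, SHIFT), rigid_motion(Q, 0.7, SHIFT)
print(distance(P, Q), distance(P2, Q2))  # equal up to rounding

# Additive group of integers: x - y is unchanged when n is added to both.
X, Y, N = 17, 5, 1000
print(X - Y, (X + N) - (Y + N))
```

The coordinates of the two points change under the motion, while the distance between them does not; that is the sense in which distance is an invariant of the group.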

Newton really accepts, as the underlying 
group in his physical theories, this group of 
rigid motions augmented by uniform trans- 
latory motions in time. This imposes a tacit 
esthetic requirement upon the whole Newtonian
development.

The simplest illustration of the expression 
of a Newtonian law in vector terms is that 
conveyed in the familiar statement that the 
resultant of two forces acting upon a point 
is the vector sum of the two constituent 
forces. 

It is also true that in the later gravita¬ 
tional theories there exists a specific underly¬ 
ing group which to a large extent predeter¬ 
mines the form which the theory may take. 
It is for this reason that the comparative 
study of gravitational theories is possible. 

On Mathematical Consistency. Physicists 
have never worried much about the question 
of mathematical consistency. For example, 
in the heyday of classical physics, solids were 
thought of as having a possible type called 
the perfect elastic solid. The articles and 
treatises concerned with such elastic bodies 
never considered the fact that the mathe¬ 
matical theory itself would fail under certain 
conditions. Thus, suppose that two perfectly 
elastic spheres were to collide with one an¬ 
other along their line of centers with a rela¬ 
tive velocity greater than twice that of the 
disturbance velocity (that is, the “velocity of 
sound”) in the medium. Then it is true that 
the basic differential equations of motion 
themselves become completely useless and the
theory of elasticity breaks down. But this
fact was hardly mentioned as of any interest
whatsoever. The whole history of physics up
to the present day shows a disregard of this 
question of mathematical consistency. 

Now it is characteristic of the mathemati¬ 
cal point of view here advanced that such 
inner consistency is held to be a sine qua non 
in a successful theory. In what follows I 
shall keep this requirement in the fore¬ 
ground, although doing so introduces consid¬ 
erations which most theoretical physicists 
would consider as of little importance. 

Mysticism in Physical and Mathematical
Thought. The developments of the last fifty
years in mathematics and physics have involved
in ever-increasing measure semi-mystical
considerations growing out of a
highly developed sensitivity as to the role 
of formalism. It is well to say something 
about these matters before taking up the con¬ 
sideration of Newtonian and other gravita¬ 
tional theories. 

From Newton onwards, theorists in the 
field of physical thought have tended towards 
opinions of an intuitive type which constitute 
an essential directive element in their creative
work. In fact, it seems to be true that
long-continued and intensive study in any 
scientific field always leads to vague ideas 
which are felt to be of basic importance 
for deeper understanding. The concept of 
energy was initially of this type. Many 
other instances might be given. 

An important mystical idea of this kind 
was that the laws of nature can always be 
formulated by means of some “Variational 
Principle” or “Principle of Least Action.” 
For example, it was found that a ray of light 
passing through a medium with variable in¬ 
dex of refraction follows the path of least 
length in time in going from a point A to a 
point B of the body. To an astonishing 
degree it was discovered that dynamics and 
classical electrodynamics admit of very con¬ 
densed mathematical expression by means of 
an appropriate variational principle. Even 
today the physical theorist likes to show that 
a new theory can be expressed in variational 
form. 

As a mathematician I would like to point 
out that the significance of such a principle 
is not what it is often taken to be, for the following
reasons: In formulating a mathematical
problem there is always a large degree of
freedom of choice, both in choosing the inde¬ 
pendent and the dependent variables. There 
are likewise a great variety of ways of com¬ 
bining a system of equations into equivalent 
systems. Because of these facts, it is not sur¬ 
prising that one can manage to obtain a 
variational principle appropriate to almost 
any physical or mathematical theory. It is 
certain at least that theories formally rever¬ 
sible in time, such as gravitational theories, 
will yield to expression in this variational 
form. 3 Thus variational methods need to be
carefully scrutinized. 

Another important principle which has 
been effectively used by the physicist is the 
Principle of Sufficient Reason, which lies at 
the center of Leibniz’s philosophical specula¬ 
tions. As I have tried to show elsewhere, 4 
this principle is closely related to the theory 
of groups, and its significance may be con¬ 
veyed in the following symbolic diagram : 

                Principle of       Theory of
Metaphysics ←→ Sufficient   ←→  Ambiguity.
                Reason             Groups

Perhaps one elementary but typical illus¬ 
tration of the use of this Leibnizian principle 
in physics may be mentioned. Suppose that 
two equal forces act upon a point. Because 
of the ambiguity associated with the relevant 
group of motions, it is clear that the resultant 
force must not only lie in the plane of the 
lines of the two forces but necessarily falls 
along the bisector of the angle which they 
form. 

It is such principles as the Principle of 
Least Action and the Principle of Sufficient 
Reason which have led many of the foremost 
physicists to adopt a somewhat mystical atti¬ 
tude toward the physical universe. There is 
no doubt that a large part of speculative 
physics up to the latest period can be con¬ 
veniently interpreted in terms of these prin¬ 
ciples. If we conjecture with Plato that 
the Deity continually geometrizes, it seems
almost certain that the language of Deity 
will involve the theory of groups and the 
corresponding Principle of Sufficient Reason 
on the one hand, and Variational Principles
on the other!

3 See my Dynamical Systems (New York, 1927), in
particular Chapter IV.

4 The Principle of Sufficient Reason, The Rice Institute
Pamphlet, Jan., 1941, pp. 24-50.

An extreme but characteristic expression 
of the mystical attitude towards physical 
thought, and of confidence in the unlimited 
power of the mathematical symbol, is that 
of Eddington when he says in his Relativity 
Theory of Protons and Electrons (1936) : 

Unless the structure of the nucleus has a surprise in 
store for us, the conclusion seems plain—there is 
nothing in the whole system of laws of physics that
can not be deduced unambiguously from epistemo¬ 
logical considerations. An intelligence, unacquainted 
with our universe but acquainted with the system of 
thought by which the human mind interprets to itself
the content of its sensory experience, should be able
to attain all the knowledge of physics that we have
attained by experiment. . . . For example, he would 
infer the existence and properties of radium, but not 
the dimensions of the earth.

Perhaps such ideas, which are held by 
nearly all physicists in one form or another, 
merely indicate a belief that all physical 
theories are ultimately expressible in simple 
unitary mathematical terms. 

In what follows an endeavor will be made 
to state certain vague ideas concerning the 
differing gravitational theories, which are 
important for their philosophic evaluation. 

The Framework Based on the Rigid Body 
and Ordinary Time. The Newtonian theory 
starts out with the framework of space and 
time suggested by daily physical experience. 
All about us there are rigid bodies which 
need to be compared and measured. In this 
way we arrive at the concepts of geometry, 
and of a space at first attached to the earth, 
and later on attached to the fixed stars. 

Likewise, the notion of time as measured 
by clocks becomes more and more definitely 
established on an intuitive basis. Events are 
thought of as happening when seen, so that 
the concept of absolute simultaneity is firmly 
established. 

These ideas of space and of time are em¬ 
bodied in the concepts of Euclidean space 
and of ordinary time. In this space the point 
of reference, the direction of the axes of ref¬ 
erence, and the specific units of distance play 
no essential part; and likewise, the choice of 
the instant of time, called the epoch, from 
which time is measured and of the unit of 
time has no special importance. Further¬ 
more, as was noted previously, there is an 


additional degree of relativity in that any 
system moving at uniform velocity of trans¬ 
lation with respect to the system of reference 
is regarded as a valid reference system.

The Underlying Group and its Two In- 
variants. There would be no object in speci¬ 
fying here in symbolic form exactly what the 
corresponding group is; it suffices merely to 
say that if we have given any one set of space 
coordinates $x$, $y$, $z$ and time coordinate $t$,
there is a large variety of other sets of space
coordinates $x_1$, $y_1$, $z_1$ and time coordinate $t_1$
which are equivalent in a physical sense to
the given system of coordinates, namely the
sets attached to other systems relatively at
rest or in uniform translatory motion. 

The fundamental invariants of this New¬ 
tonian group are two in number: the distance 
between two points at the same time; and the 
interval of time between two events. In fact, 
the Newtonian group is precisely the most 
general group which leaves these two quan¬ 
tities invariant. 

The Corresponding Language of 3-Vectors. 
As has been noted before, the language of 
3-vectors is that which is appropriate to this
type of group, but there are certain restrictions
of which at least one should be noted:
To say that two vectors are equal is evidently
an invariantive statement, since if two vectors
are equal in any one coordinate system,
they will be equal with reference to any other 
system whether relatively at rest or in uni¬ 
form translatory motion. On the other hand, 
the vector velocity has no invariantive char¬ 
acteristics, since the vectors with respect to 
one system may not be the same as with ref¬ 
erence to another system. This circumstance 
explains why it is that velocities never appear 
directly in the Newtonian mechanical and 
gravitational theories. However, the differ¬ 
ence of two velocities is invariant in charac¬ 
ter, as may be seen from the following figure. 



Here the difference of two vectors $v_1$ and $v_2$


is indicated, first, with respect to the given 
system of reference and, secondly, with re¬ 
spect to a moving system impressed with an 
additional velocity v. It is seen that the 
final difference w is the same in both cases. 
Now the concept of acceleration is essentially 
one involving differences of velocities, and 
thus is explained the extraordinary impor¬ 
tance of the concept of acceleration and its 
nearly equivalent concept of force in the 
dynamical theories of Newton. 
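
The invariance asserted here can be written out in one line. What follows is only a sketch in the notation of the text, with $v_1$, $v_2$ the two velocities, $v$ the constant velocity impressed on the moving system, and $w$ the difference; the primes marking quantities in the moving system are an assumption of this illustration.

```latex
\begin{align*}
v_1' - v_2' &= (v_1 + v) - (v_2 + v) = v_1 - v_2 = w, \\
\frac{du'}{dt} &= \frac{d}{dt}\,(u + v) = \frac{du}{dt}
  \qquad (v \text{ constant}).
\end{align*}
```

The second line shows why the acceleration, unlike the velocity $u$ itself, has the same value in every admissible system of reference.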

More explicitly, the force exerted on a 
body is measured by the product of its mass 
and its (vector) acceleration, that is, by M 
du/dt where M is a scalar invariant called 
the mass and du/dt is the notation of the 
calculus for the (vector) rate of change of 
the vector velocity u along the path of the 
mass in question. 

The Particle Model of Newtonian Gravita¬ 
tion. Imagine now a system of any number 
of mass particles in otherwise empty space. 
For simplicity, we may consider first the case 
of only two particles of given masses $m_1$ and
$m_2$.

The Newtonian law states that the force 
which either mass exerts on the other acts
along the line joining the two particles and 
is inversely proportional to the square of the 
distance between them; in absolute gravita¬ 
tional units of mass the force is precisely 
equal to the product of the masses divided 
by the square of the distance. We always 
employ such absolute units in what follows.

It is evident that this formulation is in 
accord with the underlying Newtonian group 
and is as simple a formulation as can be conceived
if the force is to tend to disappear as
the distance between the two particles in¬ 
creases indefinitely. It is true that it is natu¬ 
ral to consider the exponent 1 as well as 2. 
But the analysis by Newton of the known 
Keplerian laws of motion in two body motion 
showed that the exponent must be 2. 

When more than two particles are present 
the resultant force on any one of them is 
simply taken as the vector sum of the forces 
of attraction of all the other mass particles. 
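
The rule of superposition just stated can be sketched numerically. The function below is an illustration only: the name `net_force`, the list layout, and the sample masses and positions are assumptions made for this example, and absolute gravitational units are used, so that the force between two particles is the product of the masses divided by the square of the distance.

```python
import math

def net_force(index, masses, positions):
    """Resultant Newtonian force on particle `index`, in absolute units.

    Each other particle attracts it along the joining line with magnitude
    (product of masses) / (distance squared); the attractions are added
    as 3-vectors.
    """
    fx = fy = fz = 0.0
    xi, yi, zi = positions[index]
    for j, (mj, (xj, yj, zj)) in enumerate(zip(masses, positions)):
        if j == index:
            continue  # a particle exerts no force on itself
        dx, dy, dz = xj - xi, yj - yi, zj - zi
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        magnitude = masses[index] * mj / (r * r)
        # Unit vector pointing from particle `index` toward particle j.
        fx += magnitude * dx / r
        fy += magnitude * dy / r
        fz += magnitude * dz / r
    return (fx, fy, fz)

# Hypothetical configuration: two equal masses placed symmetrically
# about a third particle at the origin.
masses = [2.0, 3.0, 3.0]
positions = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
force_on_middle = net_force(0, masses, positions)
force_on_right = net_force(1, masses, positions)
```

In this symmetric configuration the two attractions on the middle particle cancel, while the particle on the right is drawn toward the other two.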

This idealized model of the solar system 
has explained practically completely the 
facts of observational astronomy and pre¬ 
dicts correctly the future motion of heavenly 
bodies to a remarkable extent. Laplace and 


other eminent later astronomers have verified 
the Newtonian theory in more and more 
detail. 

The “Cosmic Dust” Model. In order to
explain gravitationally the flattening of the 
earth at the poles, tidal motion, etc., it is 
necessary to extend the particle model so as 
to embrace rigid, elastic, and fluid bodies. 
Newton began this process of extension by 
proving that homogeneous rigid spheres 
would attract one another according to his 
theory exactly as though their masses were 
concentrated at their centers. By proving 
this basic theorem, he was able to reduce this 
model to that of the simpler particle model. 
There is no theoretic difficulty in applying 
the Newtonian theory to any given type of 
matter, since, according to Newton’s law, the 
known gravitational forces are merely super¬ 
imposed on the other forces. 

Unfortunately, the rigid body model is 
absolutely out of place in a relativistic theory 
of gravitation. A simpler type of model 
which is serviceable in other cases, however, 
is that of inchoate matter formed by cosmic 
dust. The state of such cosmic dust is 
thought of as characterized by its density 
and vector velocity at each point, each of the 
particles being attracted towards all the 
other particles in accordance with the New¬ 
tonian law. This model is especially con¬ 
venient since it enables one to compare 
directly the recent relativistic gravitational 
theories with that of Newton. 

With this cosmic dust model we have only 
to state the following two requirements of 
Newtonian gravitation in mathematical form, 
using the appropriate language of 3-vectors: 
mass is conserved; the acceleration of each 
point is in accordance with the limiting form 
of gravitational law obtained by passing from 
the case of many small particles to the limit¬ 
ing case of a continuous distribution of mat¬ 
ter. These turn out to be fully expressible in 
the following abbreviated form: 


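$$\frac{\partial \rho}{\partial t} + \operatorname{div}(\rho u) = 0,$$

$$\frac{du}{dt} = \operatorname{grad} g,$$

$$\operatorname{div}\,\operatorname{grad} g = -4\pi\rho.$$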


Here $\rho$ designates the density, $u$ stands for
the vector velocity, and $\pi$ is the familiar ratio
of the circumference of a circle to its diame¬ 
ter. The last written equation is called Pois¬ 
son’s equation, and the function g which 
enters is called the gravitational potential 
and is required to be zero at infinity. 
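
As a check on the signs, one may work out the simplest case, that of a single particle of mass $m$ at the origin. This is only a sketch: the symbols $r$ (distance from the origin), $\hat r$ (unit radial vector), and $S$ (any sphere about the origin, with outward normal $n$) are introduced for the illustration and are not part of the text.

```latex
\begin{align*}
g &= \frac{m}{r}, \qquad g \to 0 \ \text{as}\ r \to \infty, \\
\operatorname{grad} g &= -\frac{m}{r^{2}}\,\hat r
  \quad\Longrightarrow\quad
  \frac{du}{dt} \ \text{is directed toward the mass, with magnitude}\ \frac{m}{r^{2}}, \\
\operatorname{div}\operatorname{grad} g &= 0 \quad (r > 0), \qquad
\oint_{S} \operatorname{grad} g \cdot n \, dS = -4\pi m .
\end{align*}
```

The last relation is what Poisson's equation requires, by the divergence theorem, when the whole mass $m$ lies inside $S$; the second line recovers the inverse square law in absolute units.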

These equations are written down because 
they serve to show how the appropriate lan¬ 
guage of the calculus and of 3-vectors yields 
the essence of a grandiose and extensive the¬ 
ory in extraordinarily abbreviated symbolic 
form. It will be especially instructive as we 
proceed to compare visually this symbolic 
form with the related symbolic forms of the 
other theories of gravitation. 

Physical, Philosophical, and Mathematical 
Difficulties. Despite the tremendous suc¬ 
cesses scored by the Newtonian theory, there 
are certain inherent difficulties which remain 
to be mentioned. 

Firstly, as noted above, there are delicate 
gravitational effects, just within observa¬ 
tional range, which the Newtonian law of 
gravitation does not account for. 

Secondly, certain natural philosophical requirements
are violated by the theory. For
example, this law assumes that gravitational 
forces are transmitted instantly, whereas 
elsewhere in nature it appears that inter¬ 
action between distant bodies is always 
propagated with finite velocity across the 
intervening space. Newton himself was 
aware of this grave difficulty of his theory. 

Furthermore, gravitational forces are de¬ 
clared to be merely superadded to the other 
natural forces. Thus gravitation appears as 
a kind of afterthought on the part of the 
Creator! 

Moreover, according to this theory, there 
might be a single rotating body. From the 
philosophical point of view, however, it ap¬ 
pears unreasonable to think that, if there 
were only a single body in the universe, it 
could be rotating. For with respect to what 
would it rotate? Nevertheless, according to
Newtonian theory this could be the case. 

In what follows we shall merely mention 
similar general philosophical comments. The 
following fact should always be borne in 
mind in this connection: There are many 
plausible philosophical demands, and yet no 


conceivable theory can satisfy them all since 
they are often mutually contradictory. 

Thus, for example, it is natural, on the 
one hand, to suppose that matter somehow 
conditions space. But it also appears almost 
inevitable to think of events as transpiring 
in an invariable framework of space and
time.

Under these circumstances it is wise to
preserve a certain humility of spirit and not
to insist too much upon specific philosophic
requirements!

There are also some mathematical difficul¬ 
ties in the Newtonian theory, in particular 
as applied to the “particle” or the “cosmic
dust” model.

In the case of the model based on particles, 
the difficulty arises at collision. As long as 
only two of the mass points collide, it is pos¬ 
sible to determine mathematically the sub¬ 
sequent development of the system in a 
unique and natural manner. However, it 
appears that when three or more particles 
collide simultaneously, there is a mathemati¬ 
cal indeterminateness in the subsequent 
motion. Such indeterminateness seems ob¬ 
jectionable in a physical theory based on the 
concept of causation. From this point of 
view the particle model has an unquestion¬ 
able defect. 

Likewise the model based on the type of 
matter called cosmic dust presents its own 
special difficulties. In fact a dust cloud will 
from time to time overlap itself, when different
parts interpenetrate; this would not take
place if there existed elastic pressure. Even
worse, a three-dimensional cloud of dust may 
be turned inside out as time elapses: Suppose 
that in a spherical cloud all of the points are 
moving towards the center with a velocity 
proportional to the distance from the center. 
The cloud would then condense to a point at 
a certain instant and continue as an expand¬ 
ing spherical cloud, but “inside out.” 
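
The turning inside out can be illustrated by a purely kinematic sketch. It assumes, for simplicity, that each dust point continues with its initial velocity $-k x_0$, mutual attraction being ignored; in a sphere of uniform density the attraction is likewise proportional to the distance from the center, so the proportionality among the points would be preserved there too. The names `dust_positions`, `cloud`, and `k` are hypothetical.

```python
def dust_positions(initial_positions, k, t):
    """Positions at time t of dust points whose initial velocity is -k * x0.

    Under free motion every point obeys x(t) = x0 * (1 - k * t), so the
    whole cloud is scaled by one common factor.
    """
    factor = 1.0 - k * t
    return [tuple(factor * c for c in p) for p in initial_positions]

# Hypothetical cloud of three points and rate constant.
cloud = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (-0.5, 0.0, 0.5)]
k = 0.5

collapsed = dust_positions(cloud, k, 2.0)  # t = 1/k: the factor is 0
inverted = dust_positions(cloud, k, 4.0)   # t = 2/k: the factor is -1
```

At $t = 1/k$ every point lies at the center; afterwards the common factor is negative, so each point reappears on the opposite side of the center and the cloud expands "inside out."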

If, however, we introduce a homogeneous 
adiabatic fluid or gas, with a pressure $p$ and
density $\rho$ which are functionally related as
in a perfect gas, there will still exist a similar 
possibility of indeterminateness, namely 
when two portions collide at a relative veloc¬ 
ity more than twice the disturbance velocity. 

There is, however, an artificial kind of 
mass particle which avoids these difficulties. 
If we suppose that the force of interaction 
between two particles is one of mutual re¬ 
pulsion at small distances, then collision may 
be impossible. This would happen if, for 
example, besides the Newtonian force of 
attraction proportional to the inverse square 
of the distance, there were a further force of 
repulsion proportional to the inverse cube of 
the distance. 

Unfortunately, with this type of model, 
instead of the excessive perihelial advance 
which is found in the case of Mercury there 
would be a regression. Furthermore the 
modified theory would be definitely less ele¬ 
gant than that of Newton, since the particles 
would no longer be characterized physically 
by their mass alone. 
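
Both statements about the modified law can be made plausible by a short computation. The following is a sketch under stated assumptions: a small body moves about a fixed central mass $m$ under the force per unit mass $-m/r^2 + c/r^3$, where the constant $c > 0$, the angular momentum per unit mass $h$, the energy per unit mass $E$, the eccentricity-like constant $e$, and the variable $u = 1/r$ are all introduced for this illustration.

```latex
\begin{align*}
\frac{d^{2}u}{d\theta^{2}} + u
  &= \frac{m u^{2} - c\,u^{3}}{h^{2}u^{2}}
   = \frac{m}{h^{2}} - \frac{c}{h^{2}}\,u , \\
\frac{d^{2}u}{d\theta^{2}} + \omega^{2} u &= \frac{m}{h^{2}},
  \qquad \omega = \sqrt{1 + \frac{c}{h^{2}}} > 1 , \\
u &= \frac{m}{h^{2} + c}\,\bigl[\,1 + e\cos \omega(\theta - \theta_{0})\,\bigr] , \\
\tfrac{1}{2}\,\dot r^{2} + \frac{h^{2} + c}{2r^{2}} - \frac{m}{r} &= E .
\end{align*}
```

Successive perihelia are separated by the angle $2\pi/\omega < 2\pi$, so the perihelion regresses by $2\pi(1 - 1/\omega)$ in each revolution. The last line shows that the term $c/2r^2$ acts as a barrier at small distances even when $h = 0$, which is why collision is excluded.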





Reprinted from The Scientific Monthly, February, 1944, Vol. LVIII,
pages 135-140.


NEWTONIAN AND OTHER FORMS OF 
GRAVITATIONAL THEORY 
II. RELATIVISTIC THEORIES 

By GEORGE D. BIRKHOFF 


The Larmor-Lorentz-Einstein Framework 
of Space-Time. If an impartial observer, 
supplied with an astronomical clock and a 
telescope but with no other means of physical 
observation, were to make observations on 
the heavens, he would discover a universe in 
which all of the stars appeared on an equal 
basis. He would observe that in his indi¬ 
vidual space and time, the laws of geometry 
hold, but he would grant that exactly the 
same laws must hold equally well for any 
other observer. He would of course realize 
that light travels with respect to him at a 
velocity which he might well take to be unit 
velocity. But he would expect light to travel 
in the same fashion relative to the space of 
an observer on another star. In this way he 
would be led without logical difficulty to a 
certain space-time framework in which the 
notion of simultaneity has no longer the same 
significance for all observers; in particular, 
events simultaneous for the observer on star 
A would not be simultaneous for an observer 
on star B, unless A and B are relatively at 
rest. His final unavoidable mathematical 
conclusion would be the following: It is pos¬ 
sible to set the observations of other observers 
in agreement with his own by a very simple 
rule: the clock of any other observer appears
to go at a slower rate in the ratio of $\sqrt{1 - v^2}$
to 1, and distances in the line of motion are 
contracted in the same ratio. Here v is the 
relative velocity of the other star. 
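
The size of this ratio is easily tabulated. The sketch below takes the velocity of light as the unit, as in the text; the function name `rate_factor` and the sample velocities are assumptions of the illustration.

```python
import math

def rate_factor(v):
    """Ratio sqrt(1 - v^2) to 1, the velocity of light being taken as 1.

    The same ratio gives the slowing of the moving clock and the
    contraction of distances in the line of motion.
    """
    if not -1.0 < v < 1.0:
        raise ValueError("relative velocity must be below that of light")
    return math.sqrt(1.0 - v * v)

# Hypothetical relative velocities, as fractions of the velocity of light.
factors = {v: rate_factor(v) for v in (0.0, 0.1, 0.6, 0.99)}
```

For velocities small compared with that of light the ratio differs from 1 only by about $v^2/2$, which is why the effect escapes ordinary observation.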

Working from electromagnetic considera¬ 
tions alone, Lorentz was led to assume about 
1892 that such a spatial contraction takes 
place in moving electromagnetic systems. 
Later on, Larmor and he recognized more 
explicitly the necessity for a similar time con¬ 
traction. By this means not only were the 
null effects of the Michelson-Morley experi¬ 
ments fully explained, but a number of other 
null effects as well. 

We may say, then, that the space-time 
appropriate to the stellar universe and to 
electrodynamics is this type of “astro-electric”
space-time. However, while Larmor
and Lorentz saw in the situation a certain 
disagreeable indeterminacy in the specifica¬ 
tion of the underlying ether, Einstein was 
led (1905) to affirm positively that from the 
very nature of physical law it is impossible 
to single out one of these systems of reference 
as being at rest. This conclusion would ap¬ 
pear entirely natural to the impartial stellar 
observer referred to above. 

The Lorentz Group. The theory so ob¬ 
tained constitutes Einstein’s special theory 
of relativity and amounts to a change from 
the Newtonian group of motions to what we 
shall call the Lorentz group, although it 
might with equal propriety be called the 
Larmor-Lorentz group. In this type of 
framework of reference we have a homogene¬ 
ous four-dimensional framework in which one 
coordinate is “time-like” and the other three 
are “space-like,” but nevertheless space and 
time are fundamentally commingled. The 
two invariants of distance and interval of 
time of the Newtonian framework are now 
replaced by a single invariant termed by 
Lorentz the “local time,” which may be writ¬ 
ten in the symmetric form 

$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2.$$

The language of 4-vectors is completely 
appropriate to this type of space-time, and 
the mystic equation holds: 

1 second = 186,300 miles. 

Today every well-informed physical theo¬ 
rist accepts this type of space-time as funda¬ 
mental in the electromagnetic domain. The 
4-vector language referred to is identical in 
structure with that of 3-vectors, fundamental
in the Newtonian theory. 
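
That the single invariant survives a change of observer, while simultaneity does not, can be checked numerically. This is a sketch only: it uses the standard transformation for relative motion along the $x$ axis with the velocity of light equal to 1, and the names `boost_x` and `interval_squared`, together with the sample events, are assumptions of the illustration.

```python
import math

def boost_x(event, v):
    """Transform an event (t, x, y, z) to an observer moving with velocity v along x."""
    t, x, y, z = event
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return (gamma * (t - v * x), gamma * (x - v * t), y, z)

def interval_squared(event):
    """The invariant t^2 - x^2 - y^2 - z^2 of an event measured from the origin."""
    t, x, y, z = event
    return t * t - x * x - y * y - z * z

# Hypothetical event and relative velocity.
event = (2.0, 1.0, 0.5, -0.3)
moved = boost_x(event, 0.6)

# An event simultaneous with the origin (t = 0) for the first observer.
distant = boost_x((0.0, 1.0, 0.0, 0.0), 0.6)
```

The interval of `moved` agrees with that of `event`, whereas the time coordinate of `distant` is no longer zero: the two events that were simultaneous for the first observer are not so for the second.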

The Gravitational Theory of Nordstrom 
(1912). The question immediately presents 
itself as to how gravitation is to be accounted 
for in such a modified framework of space- 
time. The first suggestion is to take over 
directly the equations written in Part I as 
characteristic of the Newtonian theory. 



However, such a direct transposition is not 
possible for a reason which can be readily 
specified. 

In the new type of space-time, where the 
natural independent variable is the local time 
appropriate to each particle, and the velocity 
vector has four components instead of three,
the “length” of the velocity vector turns out
to be 1, while the acceleration vector must be
considered as always being at right angles
(“orthogonal”) to the velocity vector. This
makes it necessary to insert a very simple
type of additional term in the second of the
equations given earlier in the case of the 
cosmic dust model. 

Furthermore the first equation there writ¬ 
ten, which yields the conservation of mass, 
may be united with the second equation, by 
employing the symmetric velocity bivector 
multiplied by the density p; this product 
may be called the energy bivector and is de¬ 
noted by T. Thus the complete equations 
reduce to only two in the Nordstrom theory. 


$$\operatorname{div} T = \rho\,[\operatorname{grad} g - (u \cdot \operatorname{grad} g)\,u],$$

$$\operatorname{div}\,\operatorname{grad} g = 4\pi\rho.$$

If we desire to use as model a homogeneous 
adiabatic fluid or gas it is only necessary to 
add a pressure term to the tensor T.* 
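
The role of the additional term can be seen from a short computation. It is offered as a sketch for the cosmic dust case, where $T$ has components $\rho u^i u^j$; the dot denotes the 4-vector inner product belonging to $ds^2$, and $s$ is the local time along the path. The component notation is an assumption of this illustration.

```latex
\begin{align*}
u \cdot u = 1
  &\;\Longrightarrow\; u \cdot \frac{du}{ds} = 0 , \\
u \cdot \bigl[\operatorname{grad} g - (u \cdot \operatorname{grad} g)\,u\bigr]
  &= (u \cdot \operatorname{grad} g)\,(1 - u \cdot u) = 0 , \\
\operatorname{div} T
  &= u\,\operatorname{div}(\rho u) + \rho\,\frac{du}{ds} , \qquad
u \cdot \operatorname{div} T = \operatorname{div}(\rho u) .
\end{align*}
```

Thus the right-hand member of the first Nordstrom equation is orthogonal to $u$, as an acceleration must be. Taking the inner product of that equation with $u$ gives $\operatorname{div}(\rho u) = 0$, the conservation of mass, and what remains is $du/ds = \operatorname{grad} g - (u \cdot \operatorname{grad} g)\,u$.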

The close formal analogy between this rela¬ 
tivistic theory of Nordstrom and the New¬ 
tonian theory of gravitation thus becomes ap¬ 
parent. 

On this basis Nordstrom was able to ac¬ 
count for all of the usual phenomena of 
gravitation. The primary reason why the 
theory was unfavorably received at the time 
was that it would yield a regression instead 
of the desired advance in the perihelion of 
Mercury. The new theory predicted the 
same shift of spectral lines toward the red 
as did the later generalized theory of relativ¬ 
ity of Einstein. 

It is interesting in this connection to quote 
a note added by Nordstrom during the final 
correction of proofs of his article: 

I have learned from a communication by letter from
Professor Einstein that he has already occupied himself
with the possibility dealt with here, namely, to
treat gravitational phenomena in a simple way, and
that he has come to the conviction that the consequences
of such a theory cannot correspond with the
reality. He shows through a simple example that
according to this theory a rotating system in a gravitational
field undergoes a smaller acceleration than a
non-rotating system. . . . However the consequence
mentioned indicates that my theory cannot be united
with the Equivalence Hypothesis of Einstein. . . .
But although the Einstein hypothesis is remarkably
ingenious, it introduces great difficulty. ... On that
account it is desirable to consider gravitation from
other points of view, and I will permit my communication
to serve as a contribution in this direction.

* That is, $T = (\rho u^i u^j - p g^{ij})$ where the constants $g^{ij}$
are zero for $i$ and $j$ different, and $g^{11} = 1$, $g^{22} = g^{33} = g^{44} = -1$.

Possibly mathematicians, interested in 
natural varieties of important theories rather 
than solely in pursuing the ignis fatuus of 
the "one true theory,” will some day give 
sufficient attention to this simple gravita¬ 
tional theory of Nordstrom’s in order to 
develop the principal properties of the con¬ 
ceptual universe which he has defined. For 
example, the two-body problem might well 
be investigated as it arises in his theory. 

Before passing on, however, it is impor¬ 
tant to note one modification arising in the 
Nordstrom theory. In fact, the last equation 
written, analogous to the Poisson equation in 
the Newtonian theory, is used to determine 
the Nordstrom gravitational potential g as 
a retarded potential. This means that al¬ 
though the theory is reversible from the 
formal point of view, it will not be so in 
actuality. In fact, when time is reversed, 
the retarded potential becomes an advanced 
potential, and hence motions are not reversible.
In consequence one is led to expect
the gradual loss of energy through a kind of 
gravitational radiation. 

The Equivalence Hypothesis and Tensors. 
With reference to a body falling freely in a 
vacuum toward the earth, gravitational phe¬ 
nomena seem to disappear. It was the fun¬ 
damental assumption of Einstein’s which led 
to his general theory of 1916 that a like situa¬ 
tion holds in all gravitational fields, at least 
to a certain extent. This conjecture constitutes
his celebrated Equivalence Hypothesis.

When the hypothesis is employed in con¬ 
junction with the daring but philosophi¬ 
cally plausible hypothesis which it suggests, 
namely, that space-time is conditioned by 
matter, one is led directly to Einstein's gen¬ 
eral theory of relativity. In this theory 
the geometrical ideas of Riemann concerning
generalized spaces and the corresponding 


tensorial language of Ricci and Levi-Civita 
play a fundamental part. 

The Associated Group and its Invariant. 
The associated group is now the extremely 
general one which admits all possible con¬ 
tinuous deformations of space-time. This 
means that any coordinates of reference 
whatsoever may be employed. There has 
been a tendency on the part of Einstein and 
others to give this characteristic feature of 
the new theory of relativity a fundamental 
philosophical significance. But it must be 
remarked in this connection that any entity 
can be expressed in terms of such general 
coordinates. For example, a spherical sur¬ 
face in ordinary space is a very concrete 
mathematical entity, and yet it can be defined
intrinsically as a simply connected two-dimensional
manifold with constant Riemannian
curvature. In this way we achieve the
intrinsic definition of a specific surface in 
the general language of tensors. 

The single fundamental invariant is usu¬ 
ally designated by ds and may be called “the 
element of local time,” since it has that significance
from the physical point of view.
With Riemann, it is supposed that the square
of this small element of time, $ds^2$, is measured
by a quadratic expression in terms of the 
small changes of the coordinates. It can be 
proved that by choosing suitable “normal 
coordinates,” the local space-time framework 
in the neighborhood of a point, (an event) in 
the four-dimensional world under consider¬ 
ation, takes the same form as that basic in 
the space-time of the special theory of rela¬ 
tivity, namely that defined by 

$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2.$$

Here the light-second is taken as the unit of 
distance. 

In such normal coordinates the Equiva¬ 
lence Hypothesis means that bodies move in 
locally straight lines with locally uniform 
velocity, while light is also propagated 
rectilinearly with the velocity 1. 

It is a fundamental fact that when such 
normal coordinates are employed, the same 
vector notation as before is generally avail¬ 
able for expressing the tensor formulation. 

The Gravitational Theory of Einstein. 
With these considerations in mind it is easy 
to explain the general motivation of Ein¬ 
stein’s theory. 


As in the theory of Nordstrom, it is suffi¬ 
cient to introduce the symmetric energy 
tensor of the second order T where the asso¬ 
ciated matter is conceived of as made up 
either of cosmic dust or of a homogeneous 
adiabatic fluid. The equations of motion 
and the condition for the conservation of 
mass combine into the single tensor equation 
div T = 0. 

Furthermore, if we denote by G the funda¬ 
mental symmetric tensor of the second order 
associated with $ds^2$, then the condition that
space-time is flat where there is no matter
reduces to div grad $G$ = 0. The natural way
to formulate the condition that space-time is 
curved by matter is to replace 0 on the right- 
hand side of the above equation by the sim¬ 
plest possible expression involving the en¬ 
ergy tensor of matter T. 

The uniquely indicated outcome gives the 
gravitational equations of Einstein:


$$\operatorname{div} T = 0,$$

$$\operatorname{div}\,\operatorname{grad} G = -8\pi\,\bigl(T - \tfrac{1}{2}\,|T|\,G\bigr).$$

These equations are a good deal more com¬ 
plicated in explicit detail than they are con¬ 
ceptually. In fact, the theory introduces 
ten new gravitational potentials instead of 
the single potential of Newton; and when we 
use general coordinates instead of the very 
convenient normal coordinates, the synoptic 
equations above turn out to contain hun¬ 
dreds of terms. 

Successes and Difficulties of Einstein’s 
General Theory. There are two remarkably 
impressive features of the generalized theory 
of Einstein. The first is that it explains the 
observed excessive advance of the perihelion 
of Mercury, which the Newtonian and Nord¬ 
strom theories failed to do, and in addition 
predicts the bending of light from distant 
stars by the sun and a shift of spectral lines 
toward the red, both of which effects have 
been quantitatively verified through subse¬ 
quent observations. The second feature is 
the grandiose suggestion that gravitational 
and possibly all other physical phenomena 
merely amount to a kind of generalized geom¬ 
etry. The theory has proved extremely 
thought-provoking from the point of view of 
physical theory and from the philosophic 
standpoint as well. 


On the other hand this theory has not 
entered effectively into theoretical physics. 
There has been general agreement in the use 
of the special theory of relativity and its 
framework as basic for electromagnetism, 
and (with Larmor) to think of the general 
theory as a brilliant “auxiliary construct.” 

Certain difficulties of this theory must be 
mentioned. In the first place, the Varia¬ 
tional Principle adduced by Einstein does 
not really provide the complete equations of 
the theory even for empty space, since it is 
necessary to adjoin a second principle which 
expresses the condition that the elementary 
particle travels in a straight line locally, 
namely $\delta \int ds = 0$. Thus in a strict sense we
have two Variational Principles instead of
one, and this fact seems to destroy the significance
of the variational property. Of
course, as was stated in Part I, it does not
seem reasonable anyway to require such a
Variational Principle to hold.

Secondly, in a certain sense, the gravita¬ 
tional forces of Einstein are superimposed 
upon all other forces to the same extent as 
are the Newtonian forces. The mechanism 
of the superposition, however, is entirely dif¬ 
ferent in the two cases: In the Newtonian 
case we simply add on the postulated gravi¬ 
tational forces as further force vectors; in 
the Einstein case we first express physical 
laws in the 4-vector language of the special 
theory of relativity, and then, merely by 
changing 4-vectors and ordinary derivatives 
into tensors and “covariant derivatives,” we
automatically insert the gravitational forces.

Alternatives to the Einstein Theory. 
There have been many papers written with a 
view to modifying and extending the gravitational
theory of Einstein. It was hoped at
the beginning that this theory might provide 
the basis for explaining the apparent rapid 
expansion of the stellar universe. But it was 
later seen by Lemaitre and others that such 
an expansion would only be possible if the 
Einstein field condition (the second equation 
written above) was lightened. Likewise it 
was found by de Sitter and others that the 
four-dimensional framework might be modi¬ 
fied in an interesting way by altering the 
boundary conditions. Such changes leave 
the Einstein theory substantially uualtered. 

Among attempts to incorporate
electromagnetic phenomena as well as gravitational
phenomena in a “unitary field theory” we
may especially mention the “gauge-invariant
geometry” of Weyl (1919) and Kaluza's
five-dimensional theory (1921). In the first
of these the quantitative significance of local
time ds is abandoned, although ds = 0
continues to represent an invariantive condition.
Bergmann's recent book says of the Weyl
theory that, “in spite of the beauty of the
geometrical conception, this geometry has
not led to a successful theory.”* Likewise
it is evident that Kaluza's five-dimensional
theory and its projective equivalents (Veblen
and Hoffman, Pauli) add a fifth dimension
“which has no direct physical significance.”
Here ingenious ad hoc hypotheses
concerning the fifth dimension yield four
new world functions which operate as the
basic electromagnetic vector potential.

In this way, a number of different and
interesting further leads have been
suggested, based on curved space-time. All of
them purport to present a geometrical view
in which matter appears as a local
singularity in a geometric manifold. The technical
difficulties to be overcome in their further
development seem to me extremely great.
There is an undeniable air of unreality about
them, as there is about Einstein's generalized
gravitational theory.

Some Other Theories. There have been
attempts to obtain a simpler explanation
of the crucial phenomena referred to above.
For example, in this country H. B. Phillips
(1920) tried to deduce the same conclusions
from the Newtonian point of view. His
simple and ingenious theory has since been
rediscovered more than once. Phillips accepts
the somewhat vague Principle of Equivalence
of Einstein and shows that, by
interpreting it conveniently in the space and time
of classical physics, it is possible to arrive at
Schwarzschild's basic formula, upon which
alone all the crucial tests are known to be
based. Unfortunately, it would be hard to
deny that some of the reasoning is ad hoc,
being definitely directed towards obtaining
this very formula.

* An Introduction to the General Theory of
Relativity by P. G. Bergmann, with a Foreword by A.
Einstein (New York, 1942).

Others have tried to find a basis in the
electromagnetic space-time of special
relativity. A conspicuous illustration has been
the attempt of the English astronomer Milne
(1933).* To me his gravitational
developments seem lacking in cogency; E. A. Milne
starts with a homogeneous expanding
universe based on special relativity, and it is
hard to see how any definite theory of
gravitation results from the inherent properties of
his model.

In neither the theory of Phillips nor of 
Milne does there appear a natural analogue 
of the fundamental Poisson equation char¬ 
acteristic of the Newtonian theory. 

The Relativistic “Perfect Fluid.” The 
idea of the “perfect fluid” was introduced by 
me in 1928 but with no thought of using it 
except against the framework of the general¬ 
ized theory of relativity of Einstein. My in¬ 
terest in the perfect fluid arose from the fact 
that it was a satisfactory form of matter 
from the mathematical point of view, and 
appeared to afford a possibility for a
conceptual derivation of Schrödinger's
celebrated “wave equation” of 1927.

It was only in 1942 that it occurred to me 
to try to construct a gravitational theory in 
which the perfect fluid was used against the 
background of the flat space-time of the spe¬ 
cial theory of relativity. To my surprise I 
found that the simplest possible theory of 
this type led to results in complete accord 
with known gravitational phenomena. It 
was this new theory which I presented at the 
Astrophysical Congress in Mexico in 1942. 
Before proceeding with the consideration of 
this theory, it is desirable to make clear why 
the perfect fluid deserves our special atten¬ 
tion. 

To begin with, let us examine from the 
mathematical point of view, what type of 
fluid is worthy of being a model for matter. 
The model of cosmic dust, used in the Ein¬ 
stein theory, appears invalid inasmuch as 
such matter can interpenetrate freely, con¬ 
dense on a point, and turn inside out, in the 
natural course of events! 

* See his Relativity, Gravitation and World
Structure, 1936.

On the other hand, if we consider forms of
homogeneous adiabatic fluid in which there is
pressure, we avoid these difficulties, but we
run into another difficulty which is equally
serious. Such fluid will have a definite
disturbance velocity at all densities. If this
disturbance velocity is greater than that of
light, there is evidently a fundamental
difficulty, since it is a basic presupposition that
the velocity of light is a limiting velocity.

If this velocity is less than that of light, and
if two portions of the fluid collide at
oppositely directed velocities exceeding this
disturbance velocity, then the equations break
down. Thus it seems essential to demand
that at all densities, the disturbance velocity
must be exactly equal to that of light. When
we do so, there is obtained the “perfect
fluid” with “equation of state” $p = \tfrac{1}{2}\rho$. For
reasons which I cannot enter upon here, it
seems certain that no difficulties can ever
arise with interacting and colliding portions
of perfect fluid, moving in a relativistic
framework.
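
The requirement above can be checked numerically in a small sketch. It rests on an assumption not stated in this passage: that ρ is the coefficient of the velocity term in the energy tensor, so that the energy density in the rest frame of the fluid is e = ρ − p. With the velocity of light taken as 1, the squared disturbance velocity is then dp/de, and the relation p = ½ρ gives e = p, hence a disturbance velocity of 1 at every density. The function names are illustrative only.

```python
def pressure(rho):
    """Equation of state of the perfect fluid: p = rho / 2."""
    return 0.5 * rho


def rest_energy_density(rho):
    """Assumed convention (not from the text): e = rho - p."""
    return rho - pressure(rho)


def disturbance_velocity(rho, h=1e-6):
    """Estimate sqrt(dp/de) by central differences in rho (light velocity = 1)."""
    dp = pressure(rho + h) - pressure(rho - h)
    de = rest_energy_density(rho + h) - rest_energy_density(rho - h)
    return (dp / de) ** 0.5


# The estimate stays at the velocity of light for small and large densities.
for rho in (0.1, 1.0, 50.0):
    print(rho, disturbance_velocity(rho))
```

Under a different convention for ρ the same check would give a different velocity, so the sketch illustrates the argument only for the convention assumed here.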

Such a fluid will be almost incompressible, 
due to the enormous disturbance velocity, 
and its mass will not be strictly invariable. 
It is impossible therefore to criticize its 
structure on account of any thermodynamic 
argument based upon the strict conservation 
of mass. As has been kindly remarked in 
Buenos Aires by my eminent colleague, Pro¬ 
fessor Enrique Butty, the perfect fluid has 
a certain analogy with the well-known “lu¬ 
miniferous ether” of Young and Fresnel in 
which light travels in all directions with the 
same velocity. There is, however, the basic 
difference that while the luminiferous ether 
is essentially static, the perfect fluid is con¬ 
ceived of as only filling part of electromag¬ 
netic space-time, and as mobile and dynamic. 

The New Theory. My theory, like
Nordström's, is built upon the electromagnetic
framework of the special theory of
relativity. I start from the perfect fluid as a
model and introduce a symmetric
gravitational potential H (a tensor), letting T
designate as before the symmetric energy
tensor of the perfect fluid.

The equations of motion of the fluid are
of course written div T = f, where f is a force
4-vector which has to be specified. The
natural requirement, by analogy with the
Newtonian theory, is that the 4-vector f should be
linear in the “first partial derivatives” of H.
Furthermore, f must be automatically
orthogonal to the velocity vector u, since, as
has been noted above, the force and
acceleration vectors are always orthogonal.
Finally, it is natural to require that the
gravitational theory shall be formally reversible
in character, as it has been in all previous
theories. The simplest form for f which
meets these requirements is then the
following:*

$$f = \rho\, u \,\operatorname{curl} H\, u$$

Thus the two equations of the “perfect
fluid” theory are obtained:

$$\operatorname{div} T = \rho\, u \,\operatorname{curl} H\, u$$

$$\operatorname{div} \operatorname{grad} H = 8\pi\, T$$


The "perfect fluid" theory of gravitation 
embodied in the above equations not only 
yields an explanation of gravitation in its 
first order effects, but also leads to essentially 
the same conclusions as Einstein’s regarding
the three crucial second order effects! It 
seems to me to deserve careful consideration. 

There is no difficulty in incorporating the
electromagnetic as well as gravitational
forces in the new theory. Of course the
electrical charge is thought of as invariably
attached to the perfect fluid. The form of the
electromagnetic force which has to be added
to the gravitational force specified in the first
equation of the “perfect fluid” theory is of
interest, because of its structure. The more
important electromagnetic force vector is
known to be linear and homogeneous in the
velocity vector u while the gravitational force
vector is quadratic in the above theory. This
is fitting, in view of the fact that
gravitational forces are much smaller. One recalls
in this connection a daring speculation of Sir
Joseph Larmor’s in a section entitled “Are
the linear equations of the Aether exact?” of
his important work “Aether and Matter”
(1900), where he asks: “Why then should
not relatively minor phenomena like
gravitation be involved in similar non-linear terms
in . . . the analytical specification of the
free aether . . . ?”

* This notation is not quite complete, since a
triadic vector, curl H, is involved. The specific
definition is

It is to be emphasized that the new theory
is strictly based upon electromagnetic
space-time and is formulated completely in the
language of 4-vectors. The development of the
theory and its application to the crucial
phenomena is an elementary and simple matter
requiring only four or five pages of routine
mathematical work.

Whether or not these equations can be
obtained out of a simple, unified variational
principle remains to be seen. I have not as
yet had an opportunity to investigate this
interesting question. Likewise I have not as
yet found time to examine whether or not the
theory suggests further experiments, by the
aid of which it and the Einstein theory of
1916 might be compared.

Concluding Remarks. My primary pur¬ 
pose has been to convey some idea of the in¬ 
timate formal relation between the grandiose 
theory of universal gravitation of Newton 
and certain recently proposed relativistic 
modifications. It seems clear that the New¬ 
tonian theory will always stand as the real¬ 
istic basis for astronomical calculations. 
Relativistic theories are likely to be used 
only in a few cases when large velocities enter 
and the minute relativistic effects can be ob¬ 
served. 

Nevertheless such relativistic theories seem
more in accord with the electromagnetic
structure of matter than does the theory of
Newton. These new theories deserve much
more serious attention than they have
received, both from theoretical physicists and
mathematicians, for they are as yet in a
highly incomplete state. For example, it is
not even known whether a real analogue of
the two-body problem exists in any of them.
Furthermore, not one of them is really a field
theory in the complete sense of the classical
electromagnetic theory of empty space, since
all differentiate between the parts of
space-time where matter is and void space-time.

One cannot but feel the highest admira¬ 
tion for the solid and permanent accomplish¬ 
ments of the gravitational theory of Newton, 
and for the splendid developments of classi¬ 
cal physics which it inspired. It would be 
intensely interesting to know how Newton 
himself would regard the relativistic variants 
of his theory which have been suggested by 
modern developments in electromagnetism 
and modern mathematical formalism. 






Reprinted from Boletín de la Sociedad Matemática Mexicana,
Vol. 1, nos. 4-5, July-Oct. 1944, pp. 1-23.


THE MATHEMATICAL CONCEPT OF TIME
AND GRAVITATION


By Dr. George D. Birkhoff, of
Harvard University. *


PART I

GENERAL DISCUSSION OF THE CONCEPT OF TIME

1. Absolute time

It is not difficult to explain the (absolute) time of common sense from
the genetic point of view. Until the discovery of the finiteness of the
velocity of light (Römer, 1675), it was considered natural that events
happen at the instant in which they are seen. This almost intuitive
judgment was applied in the same way to stellar events and to those
taking place on the earth. Physicists and astronomers tried to give a
complete description of phenomena and their laws, employing the
additional idea of an “absolute space” of three dimensions (that of Euclid).
Thus physicists and astronomers provided themselves with a stage on
which every event could play its part. With these instruments of thought
the magnificent success of the classical epoch in the physical sciences
was obtained.

* This paper was presented by its author to the International Congress
of Astrophysics, held at Tonanzintla, Puebla, in February 1942.

Neither to Newton nor to his successors did it seem that the discovery
of the finiteness of the velocity of light affected the validity of the
physical idea of an absolute time t. Newton said in the first scholium of
his celebrated “Principia”:

“I. Absolute, true, and mathematical time, of itself and by its own
nature, flows equably, without regard to anything external, and by
another name is called duration; relative, apparent, and common time
consists in a sensible and external measure of time (more or less) by
means of motion.”

This “relative time” of Newton corresponds to the qualitative time
mentioned below.

It is now clear to us that the intuitive notion of “absolute time” lost
its original force at the moment when Römer's observation of the
satellites of Jupiter proved the finiteness of the velocity of light. But
the scientific world contemporary with Newton, and his successors,
continued to follow the great mathematician blindly and did not realize
the true situation. Only his famous rival, the philosopher and
mathematician Leibniz, refused to accept this concept and affirmed that
time is a relative entity. Leibniz said: “I hold space as well as time to
be purely relative things. Space is an order of existences in the same
way that time is an order of successions.” He demonstrated this
relativity of time (or of space) by employing his “principle of sufficient
reason.” According to this principle God would not make use of absolute
time, because He always chooses only the best; and in absolute time
there is no difference between the distinct instants.

The classical mechanics of Galileo and Newton began to be modified
when Faraday made his electromagnetic discoveries. Faraday conceived
of electric and magnetic forces throughout all space, with intimate
interrelations. To understand them he invented the electromagnetic
ether, or space in which these forces develop. While the absolute space
of Newton admits a velocity of uniform motion of translation, the ether
of Faraday does not admit it. Unfortunately, very careful experiments by
Michelson and Morley (1887) to determine the velocity of the earth in
the ether led to negative results. In fact, they showed that this velocity
is always zero!

From the unexpected results of Michelson and Morley there followed
inevitably a further development of the physical notions of time toward
those of the special relativity of Lorentz and Einstein. But before
considering these, let us examine in a little more detail the abstract
nature of absolute time, and the corresponding ideal universe and its
peculiar history.




2. The abstraction of absolute time

The concept of absolute time can be characterized qualitatively by the
following seven postulates:

(1) There exists a class E of elements (“instants”): A, B, C, ...

(2) If A < B (read: A precedes B), then $A \neq B$.

(3) If $A \neq B$, either A < B or B < A.

(4) If $A \neq B$, it is impossible that both A < B and B < A. 1

(5) If A < B and B < C, then A < C.

(6) For any A, B (A < B) there exists a C such that A < C < B.

(7) For every double sequence

$$A_1 < A_2 < A_3 < \cdots < B_3 < B_2 < B_1,$$

the elements X such that $A_i < X < B_j$ (i, j arbitrary) constitute an
interval. 2

Let us note that the last postulate affirms that a sequence of intervals
$(A_n, B_n)$, such that $(A_n, B_n)$ includes $(A_{n+1}, B_{n+1})$ in its
interior $(n = 1, 2, \ldots)$, always tends to a limiting interval.

It is not difficult to show that a system of elements A, B, ..., with a
relation <, obeying these postulates can be represented by the points of
an infinite straight line, in such a way that the element A precedes
another B when the point $P_A$ corresponding to A precedes $P_B$ on that
line, oriented in a definite sense.
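
As a rough illustration of the order postulates, the following sketch checks (2) through (6) on a finite sample of rational “instants,” with the relation < modeled by the usual order of numbers. The choice of rationals, the midpoint used for (6), and all function names are assumptions made for the example. Postulate (7) is a completeness condition that the rationals alone do not satisfy and that no finite sample can test, so it is deliberately left out.

```python
from fractions import Fraction
from itertools import product


def precedes(a, b):
    """The relation A < B, modeled here by the usual order of rationals."""
    return a < b


def between(a, b):
    """Postulate (6): produce some C with A < C < B (here the midpoint)."""
    return (a + b) / 2


def check_postulates(instants):
    """Verify postulates (2)-(6) on every pair and triple of the sample."""
    for a, b in product(instants, repeat=2):
        if precedes(a, b):
            assert a != b                                   # (2)
            c = between(a, b)
            assert precedes(a, c) and precedes(c, b)        # (6)
        if a != b:
            assert precedes(a, b) or precedes(b, a)         # (3)
            assert not (precedes(a, b) and precedes(b, a))  # (4)
    for a, b, c in product(instants, repeat=3):
        if precedes(a, b) and precedes(b, c):
            assert precedes(a, c)                           # (5)
    return True


sample = [Fraction(n, d) for n in range(-3, 4) for d in (1, 2, 3)]
print(check_postulates(sample))
```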

The seven postulates seem very natural, and evidently serve to define
absolute time from the qualitative point of view. If we add another
postulate, equally natural from the physical point of view:

(8) All parts of this time line are identical in their physical
properties,

we shall conclude that a real number t can be defined as a coordinate on
this line which measures the temporal distance between two events A and
B, that is, $t_B - t_A$, in such a way that if A < B, then $t_A < t_B$,
and conversely.

1 Postulate (4) is not independent of (1), (3) and (5).

2 That is to say, a single element if A = X = B, or a class of elements
for $A \neq B$.




(8) may be called the postulate of isotropy of time.

Our formulation of postulate (8) is not very exact. To give it a more
acceptable mathematical form it is convenient to introduce the idea of a
physical automorphism of the system of elements and relation considered,
as a one-to-one correspondence which preserves the relation between the
elements and every other physical property. Then (8) could be formulated
in the following manner: There is one and only one automorphism which
makes an arbitrary element A correspond to another arbitrary element B.
Evidently, to fix time as a physical entity, all that is necessary is an
arbitrary instant to serve as origin.

3. The corresponding universe and its history

Newton's ideal universe, based on the ideas of absolute time and of
rigid body, gives us a remarkable first approximation to our actual
universe. To the astronomer this ideal universe appears to be governed
above all by Newton's law of universal gravitation. If attention is
restricted to the simple case of n particles (material points) of masses
$m_1, m_2, \ldots, m_n$, we can describe in its general features the
complete history (future and past) of the system.

Certain mathematical considerations due to Sundman in the case n = 3
show that, in general, the masses never collide. But if two collide, they
separate immediately in a determinate manner (analytic continuation
after the collision). In order not to complicate the situation too much,
let us ignore the very rare cases in which a collision takes place among
more than two masses. I must point out that, although such results are
very simple and elementary, they have not been completely demonstrated
by mathematicians.

In the particular case of three masses (n = 3), with constants of
angular momentum about the center of gravity not all zero, Sundman
demonstrated rigorously that the three masses cannot all collide at any
instant. Thus the motion can be continued for

$$-\infty < t < +\infty,$$

with no more than a finite number of double collisions in every finite
interval. And I demonstrated by his methods that the three masses
$P_1, P_2, P_3$ cannot approach one another unless, in the triangle
$P_1P_2P_3$, one of the sides $P_1P_2$ remains always small relative to
the other two sides $P_1P_3$, $P_2P_3$; and that the sum
$P_1P_2 + P_2P_3 + P_3P_1$ would tend toward infinity in both directions
of time. In a similar way it seems very probable that in such a system,
since the approach of two masses furnishes a large quantity of kinetic
energy, in general the three masses will separate in the long run, all
isolated, or one isolated and the remaining two in a pair, and will
recede from the center of mass toward infinity with velocities uniform
in the limit. Since the system of three masses is reversible, the same
conclusion remains valid for the past.

In the case of n (> 3) masses, similar considerations lead us to
analogous conclusions: in the future and in the past the n masses will
recede, either isolated or in pairs, and will move off toward infinity
with uniform velocities. Naturally, exceptional cases may exist, such as
that of periodic motion. Although our mathematical knowledge is
advancing very rapidly in the desired direction, the problems are
difficult and still unsolved.

From those results and conjectures, we see that in a certain
intermediate configuration the n masses must be as near one another as
possible, and that they separate from this position, in the manner
indicated, in both directions of time. Thus this abstract model, the
simplest one, presents an “expanding universe” in the past as well as in
the future. And comparing the observed facts with these theoretical
conclusions, we conclude that our universe is in the second stage of its
existence.

Evidently, the actual universe is enormously more complex than our
model of n masses in the space and time of Newton. Naturally astronomers
always corrected their observations for the finite velocity of light.
Employing these instruments and introducing rigid bodies, fluids, etc.,
they succeeded in explaining many and varied phenomena.


4. Local time and its postulates

The experimental results of Michelson and Morley indicate to us,
neither more nor less, that in nature space and time admit a new and
unsuspected type of relativity, that of the restricted (or special)
relativity of Einstein (1905). From the point of view of the astronomer,
with his clock and his telescope, the immediate and simple model
presented by the universe of the stars is one in which all of them play
an apparently identical role. In accordance with this model, it seems
very reasonable that there should be a relativity in the corresponding
space-time which does not permit them to be distinguished from one
another as physical objects; and this relativity explains the
experiments of Michelson and Morley. Moreover the mathematician
Minkowski (1908) saw that the fundamental equations of the
electromagnetic field are in accord with the new relativity.

The postulates of “local time,” devised by Lorentz and included in
restricted relativity, can be formulated in the following abbreviated
way:

(1) In suitable relative coordinates t, x, y, z of a particle (or star!),
where t measures the local time and x, y, z measure its proper space,
every particle and every ray of light move in straight lines with
uniform velocities.

(2) The velocity of light in the relative space-time is always a
constant c, which we take to be 1, since the light-second (186,300
miles) is the unit of distance. The velocity of any particle is less
than c = 1.

(3) The rates of identical clocks which measure the local times s, with
respect to any particle, do not vary and depend only on the relative
velocity of the particle observed.

Provided that we accept these postulates, we obtain without difficulty
the fundamental formula of the theory:

$$ds^2 = dt^2 - dx^2 - dy^2 - dz^2$$

where ds is the element of local time of the particle between two
neighboring events with coordinates t, x, y, z which differ by dt, dx,
dy, dz, respectively.
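
The fundamental formula can be tried on numbers with a minimal sketch, in units where the velocity of light is 1 as in postulate (2). The function names are illustrative and not part of the original text.

```python
import math


def interval_squared(dt, dx, dy, dz):
    """ds^2 = dt^2 - dx^2 - dy^2 - dz^2, with the velocity of light c = 1."""
    return dt**2 - dx**2 - dy**2 - dz**2


def local_time(dt, dx, dy, dz):
    """Local time ds elapsed along a uniform motion with speed less than 1."""
    s2 = interval_squared(dt, dx, dy, dz)
    if s2 <= 0:
        raise ValueError("not the path of a particle (ds^2 <= 0)")
    return math.sqrt(s2)


print(local_time(5.0, 3.0, 0.0, 0.0))        # particle with velocity 3/5 -> 4.0
print(interval_squared(1.0, 1.0, 0.0, 0.0))  # ray of light -> 0.0
```

A particle moving with velocity 3/5 for 5 units of coordinate time thus records 4 units of local time, while a ray of light gives ds = 0.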

But in such an astronomical universe the concept of absolute
simultaneity, which implies the notion of absolute time, loses all its
force. In the same way the concept of rigid body, which was a
fundamental theoretical instrument in classical physics, disappears. In
truth, even to this day we have not been able to find any instrument
that can replace it in modern physics! For example, in classical physics
it was legitimate to imagine a rigid sphere charged with electricity of
given intensity. No analogous concept is found in special relativity,
although in fact we are surrounded by more or less rigid bodies.

Let us note that in such a universe a type of cosmic time can be
defined. I shall explain the general method of proceeding only in the
simplest case, in which n particles of natural masses
$m_1, \ldots, m_n$ move in rectilinear trajectories with uniform
velocities. It is easy to show that in this case there exists a unique
ideal particle I, which is the exact center of gravity of the system.
Evidently a unique and convenient “cosmic time” is thus defined.

At bottom, the space-time of special relativity presents no mystery and
forms a consistent and complete system.

This type of relativity is now accepted by all who work in the domain
of electromagnetism. Unfortunately there does not exist up to now in
this theory any explanation of the phenomena of gravitation which is in
accord with it.

In the second part of this article I shall present an attempt to provide
one. Nevertheless, the theory of restricted relativity remains
incomplete even if our theory is accepted. But it would certainly be no
more incomplete than the generalized theory of Einstein (1915), invented
solely to explain the phenomena of gravitation. It would not be
difficult to define a “cosmic time” appropriate to the new theory of
gravitation, at least when the total mass and energy are finite.

5. History of a system with restricted relativity

What is the universe like that corresponds to the space-time of
restricted relativity? In considering this question very briefly we
shall accept the theory of gravitation of Part II of this article.

In the simplest case the universe will contain n masses (points) of
magnitudes $m_1, m_2, \ldots, m_n$. For moderate relative velocities,
the masses will behave in first approximation as if they were in accord
with Newton's law of gravitation. We should then expect that the
development of our universe would not differ much from that predicted in
the classical theory. But with respect to the center of gravity I and
its cosmic time, the local clocks of the receding particles (stars)
would appear to run more slowly than that of the center of mass I.
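
The slowing of the receding clocks can be sketched from the formula for the element of local time alone. The sketch assumes uniform motion with speed v relative to the center of mass I, takes the velocity of light as 1, and neglects any gravitational contribution to the clock rate; the function name is illustrative.

```python
import math


def clock_rate(v):
    """ds/dt for uniform speed v (c = 1), from ds^2 = dt^2 - dx^2."""
    if not 0 <= v < 1:
        raise ValueError("speed must satisfy 0 <= v < 1")
    return math.sqrt(1.0 - v * v)


# Rate of a receding clock per unit of cosmic time, for a few speeds.
for v in (0.0, 0.001, 0.6):
    print(v, clock_rate(v))
```

For the moderate velocities of the stars the rate differs from 1 only slightly, which agrees with the remark that the development would not differ much from the classical one.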

But the system would not be completely reversible in time, since the
gravitational potentials of the said theory involve the determination
of certain retarded potentials. Although I have not proved it, I believe
that there exists a very slow diminution in the energy; in what follows
I shall make this hypothesis, although it has not been proved.

For reasons explained later, I have employed in the proposed theory a
“perfect fluid” in place of the “cold dust” or matter of indeterminate
structure which enters into the theory of Einstein. With nearly
spherical bodies formed of such a perfect fluid and moving with small
relative velocities, a good model of the celestial system is obtained
from the gravitational point of view. In truth, not only will Newton's
law be valid, but the deviations predicted by Einstein's theory and
verified by observation are equally predicted by the new theory.
In the ideal universe it seems probable to me that, with sufficient
relative velocities of the masses, these will tend to separate, isolated
or in pairs; but it seems to me also that in the long run the pairs will
collide, in the case of fluid masses, or will approach one another
indefinitely.

Since such systems are not reversible, we cannot infer their past
history. It is clear that the possibilities are many and varied.

A skilled tailor always prefers to make a suit from a single piece of
cloth! Nevertheless, since the appearance of restricted relativity, all
physicists have made for the physical universe a theoretical suit
composed of the ideas of restricted (and even generalized) relativity
and also of the contradictory ideas of classical physics (such as those
of rigid bodies, ordinary fluids, etc.).

Nevertheless, there is no doubt that the old concepts of space and time
will never reappear. As Minkowski said in a very well-known phrase:
“From now on, space by itself and time by itself will play no other
roles than those of shadows, and only a combination of the two will
possess an actual significance.”

6. Einstein's generalized time

The ideas of Minkowski on the electromagnetic equations in restricted
relativity (1908) showed clearly the central importance of the formula
which gives $ds^2$ in this theory. It became evident that the
corresponding space-time is nothing other than a special geometry of the
type considered by the mathematician Riemann more than a century ago. It
was Einstein who replaced the element $ds^2$ by a more complex element:

$$ds^2 = g_{ij}\, dx^i\, dx^j.$$

(Notation of the absolute differential calculus of Ricci and
Levi-Civita.) The $g_{ij}$ $(i, j = 1, \ldots, 4)$ with
$g_{ij} = g_{ji}$ are the “gravitational potentials.” From his point of
view the trajectories of particles are the geodesic curves such that
$\delta \int ds = 0$ along them, with $ds^2 > 0$; the rays of light are
the geodesic curves such that $ds = 0$. It is evident that in the
restricted theory this interpretation is valid.

Proceeding thus, Einstein introduced ten new unknown functions
$g_{ij}$, whereas in the theory of Newton there is only a single
function g. To determine these functions Einstein supposed that
conditions similar to

$$R_{ij} - \tfrac{1}{2} R\, g_{ij} = T_{ij}$$

are valid. ($T_{ij}$ designates “the energy tensor of matter,” and
$R_{ij}$ is “the contracted curvature tensor.”)

With such hypotheses the observed law of gravitation was obtained as a
first approximation, and three results derived from his theory were
proposed. The prediction of the advance of the perihelion of Mercury was
in accord with the observations. That of the bending of light by the Sun
was verified experimentally later. The third was also verified; but a
completely satisfactory verification is very difficult.

Apart from the prediction of these three very small effects, Einstein's
generalized theory of relativity possesses no other justification than
that of giving an elegant theory of gravitation, which is lacking in the
restricted theory. Unfortunately, in order to do so Einstein had to
replace the simple and natural space-time of his special theory by an
enormously more complex type of space-time.

Although our theory retains the space-time of special relativity and is
simpler, it is nonetheless equally in accord with the observed facts.

Certainly it is important to have a variety of possible theories.
Consequently, I present here a new theory of gravitation, without
definitely preferring this one or the other. In truth, what is most
probable is that other theories will some day replace the greater part
of the theories of today.

The brilliant intuition with which Einstein began the general theory of
relativity is the following: Matter must affect the structure of
space-time. But there exist many very natural intuitions, and not all
can be true at the same time.

I shall not attempt to consider in any case the universe corresponding
to the second relativistic theory of Einstein, and its historical
development. The mathematical theory of the corresponding differential
equations is a veritable terra incognita in all directions. But I
venture to conjecture that if such equations define a mathematical
universe with logical consistency, it will again be possible to define a
cosmic time, at least in the case of finite mass and energy. I
conjecture also that it will be possible to follow the future and past
history of that universe during a long interval of time. Outside such an
interval, I believe that it will not be possible to infer what happens
until the moment when the true laws of the ultimate structure of matter
are discovered.

7. Time in quantum physics

We can do no more than note the absence of a true local time in quantum physics. We can only define a time made up of the aggregation of finite intervals.

In 1926, a few months after the discovery of Schrödinger's famous equation, I published an article in which I tried to use Einstein's relativistic theories to obtain Schrödinger's equation. If such an undertaking were to succeed, it would be possible to retain the idea of local time. And if the theory of gravitation that I give here were developed, or modified in a constructive direction, it would even be possible to retain the space-time of special relativity.

It seems to me that such possibilities deserve our careful attention.

I shall proceed to formulate this theory of gravitation in detail.

PART II

THEORY OF GRAVITATION

1. The perfect fluid

The element of local time of the special theory of relativity is defined as follows:

(1) $ds^2 = dt^2 - dx^2 - dy^2 - dz^2 = \delta_{ab}\, dx^a\, dx^b$,

where

$x^1 = t,\quad x^2 = x,\quad x^3 = y,\quad x^4 = z;$

$\delta_{ij} = 0$ when $i \neq j$, and $\delta_{11} = 1$, $\delta_{22} = \delta_{33} = \delta_{44} = -1$. In this article we shall use the ordinary notation of the absolute differential calculus of Ricci and Levi-Civita.¹

Let us imagine a fluid located in physical space at a great distance from any mass, so that the gravitational forces may be neglected.

Let us postulate the following as the equations of the fluid:

(2) $\dfrac{\partial T^{ia}}{\partial x^a} = f^i$,

where

(3) $T^{ij} = \rho\, u^i u^j - p\, \delta^{ij}$

and $f^i$ denotes the tensor of force per unit volume. Here $\rho$ denotes the density and $u^i$ the velocity tensor:

$u^i = \dfrac{dx^i}{ds}$.

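An immediate consequence of (1) is:

(4) $\delta_{ab}\, u^a u^b = 1$.

For the reference system at rest: $u^1 = 1$, $u^2 = u^3 = u^4 = 0$. Equations (2) reduce, in this case, to:

(2') $\dfrac{\partial}{\partial x^1}(\rho - p) + \rho \left( \dfrac{\partial u^2}{\partial x^2} + \dfrac{\partial u^3}{\partial x^3} + \dfrac{\partial u^4}{\partial x^4} \right) = 0,$

$\rho\, \dfrac{\partial u^2}{\partial x^1} = -\dfrac{\partial p}{\partial x^2} + f^2,$

$\rho\, \dfrac{\partial u^3}{\partial x^1} = -\dfrac{\partial p}{\partial x^3} + f^3,$

$\rho\, \dfrac{\partial u^4}{\partial x^1} = -\dfrac{\partial p}{\partial x^4} + f^4.$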


¹ See, for example, my book "Relativity and Modern Physics," Second Edition, 1927.




From the first of equations (2') it follows immediately that the integral

(5) $\iiint (\ldots)\; dx^2\, dx^3\, dx^4,$

calculated in the rest coordinates and extended over any volume of the fluid, is invariant in the motion of the fluid.

If an arbitrary equation of state

(6) $p = f(\rho)$

is postulated for the fluid, essential difficulties arise.

Cold dust may be regarded as a fluid in which $p = f(\rho) = 0$. Let us imagine a cloud of dust of constant density at the instant $t = 0$, and suppose that all the particles of this dust move toward the origin with constant velocities proportional to their distances from it at $t = 0$. Every dust particle will reach the origin at the same instant $t$, so that in this case there would be a concentration of finite mass at a point.

Any hypothesis other than $p = \tfrac{1}{2}\rho$ leads to a fundamental difficulty. Consider the case of a fluid of uniform density $\rho_0$ in a state of rest. The velocity $v_0$ of the disturbance wave is then given by:

$v_0^2 = \dfrac{f'(\rho_0)}{1 - f'(\rho_0)}.$

Suppose that $f'(\rho_0) \neq \tfrac{1}{2}$; then $v_0 \neq 1$. It is not in accord with the spirit of special relativity for the velocity $v_0$ to be greater than one. We may therefore suppose that $v_0$ is less than one. Let us imagine two portions of perfect fluid, symmetric with respect to a plane, which approach each other with velocities perpendicular to that plane, and whose speed of approach $v$ is greater than $v_0$ and less than 1. At the instant of collision the disturbance wave has a velocity smaller than that of the fluid mass which produces it. The equations of motion fail in the case of the collision of portions of fluid. We are obliged to suppose $v_0 = 1$, and from this it follows immediately that $p = \tfrac{1}{2}\rho$, ignoring the constant of integration, which does not enter the equations of motion. We shall suppose that in the case of equilibrium $\rho = \rho_0 > 0$. With this equation of state, the integral (5) becomes

(5') $\iiint \sqrt{\rho}\; dx^2\, dx^3\, dx^4.$

Consequently, we see that the perfect fluid does not have constant mass. What is conserved in the motion of the perfect fluid is a fictitious mass whose density is $\sqrt{\rho}$. Owing to the enormous velocity of the disturbance wave, the perfect fluid is almost incompressible at moderate velocities.
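The dependence of the disturbance velocity on the equation of state can be sketched numerically. The following minimal Python example assumes the relation $v_0^2 = f'(\rho_0)/(1 - f'(\rho_0))$, which is what linearizing equations (2') about rest gives for $p = f(\rho)$; the function names are illustrative and do not come from the article.

```python
import math

def disturbance_velocity(f_prime: float) -> float:
    """Disturbance velocity v0 for an equation of state p = f(rho).

    Assumes v0^2 = f'(rho0) / (1 - f'(rho0)), in units where c = 1.
    """
    if not 0.0 <= f_prime < 1.0:
        raise ValueError("f'(rho0) must lie in [0, 1) for a real, finite speed")
    return math.sqrt(f_prime / (1.0 - f_prime))

def fictitious_mass_density(rho: float) -> float:
    """Density of the conserved fictitious mass when p = rho / 2, as in (5')."""
    return math.sqrt(rho)

# Cold dust (p = 0), an intermediate case, and the perfect fluid (p = rho / 2).
for slope in (0.0, 0.25, 0.5):
    print(slope, disturbance_velocity(slope))
```

Only the slope $f'(\rho_0) = 1/2$ gives a disturbance velocity equal to one; smaller slopes give a velocity below one and larger slopes a velocity above one.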

Suppose that a spherical portion of perfect fluid moves slowly in an inertial reference system. If a moderate force, whose direction does not change rapidly, acts on this body, then to a first approximation the motion takes place like that of an incompressible fluid in classical physics. Indeed, we can write the equations of motion for $i = 2, 3, 4$ in the form:



The quantity in parentheses is very small because the fluid is almost incompressible, so that the equations are essentially those of classical hydrodynamics.


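2. The gravitational potentials and forces

Let us introduce the symmetric tensor $h_{ij}$ of the gravitational potentials, defined by the equations:

(7) $\Box\, h_{ij} = 8\pi\, T_{ij},$

where $\Box$ designates D'Alembert's operator:

$\Box = \dfrac{\partial^2}{(\partial x^1)^2} - \dfrac{\partial^2}{(\partial x^2)^2} - \dfrac{\partial^2}{(\partial x^3)^2} - \dfrac{\partial^2}{(\partial x^4)^2} = \delta^{ab}\, \dfrac{\partial^2}{\partial x^a\, \partial x^b}.$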

In empty space, $T^{ij} = 0$. The form of these equations agrees with the spirit of special relativity, and the $h^{ij}$ are thereby defined as retarded potentials. Note that equations (7) are not reversible in the time $t = x^1$, since a retarded potential is transformed into an advanced potential when $t$ changes sign. As can be seen, equations (7) are of the type of Poisson's equation:



The tensor $f^i$ (or equally $f_i$) is defined by the following two postulates:

(1) The forces $f_i$ (forces per unit mass) are linear and homogeneous functions of the first derivatives of the potentials $h_{ij}$.

(2) The forces $f_i$ are even, quadratic functions of the $u^i$ and depend only on the first derivatives of the gravitational potentials and on the velocities. Note that the universal condition

(8) $f_a u^a = 0$

shows clearly that $f_i$ depends on the tensor $u^i$.

The form of the $f_i$ is determined by the following reasoning, which, although not complete, is quick and simple. The tensors entering the expression for $f_i$ are:

$\mathrm{u}^i$ and $\dfrac{\partial \mathrm{h}_{ij}}{\partial \mathrm{x}^k}$;

we use here the contravariant components of the velocity vector. To form the most general tensor satisfying postulates (1) and (2), let us introduce the normal variables:

$\mathit{x}^1 = \mathrm{x}^1,\quad \mathit{x}^2 = \sqrt{-1}\,\mathrm{x}^2,\quad \mathit{x}^3 = \sqrt{-1}\,\mathrm{x}^3,\quad \mathit{x}^4 = \sqrt{-1}\,\mathrm{x}^4.$

We shall use italics for all letters representing components of tensors in the system $\mathit{x}^1, \mathit{x}^2, \mathit{x}^3, \mathit{x}^4$. The relations between the components of a simple contravariant tensor in the two systems are:

$\mathit{u}^1 = \mathrm{u}^1,\quad \mathit{u}^2 = \sqrt{-1}\,\mathrm{u}^2,\quad \mathit{u}^3 = \sqrt{-1}\,\mathrm{u}^3,\quad \mathit{u}^4 = \sqrt{-1}\,\mathrm{u}^4.$


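The transformation equations for the components are as follows in the case of a doubly covariant tensor:

$\mathit{a}_{11} = \mathrm{a}_{11};$

$\mathit{a}_{1s} = -\sqrt{-1}\,\mathrm{a}_{1s},\quad s \neq 1;$

$\mathit{a}_{s1} = -\sqrt{-1}\,\mathrm{a}_{s1},\quad s \neq 1;$

$\mathit{a}_{rs} = -\mathrm{a}_{rs},\quad r, s \neq 1.$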

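The tensor $f_i$ of the gravitational force is then a linear combination of the four vectors:

$\dfrac{\partial h_{ia}}{\partial x^a},\qquad \dfrac{\partial h_{aa}}{\partial x^i},\qquad \dfrac{\partial h_{ia}}{\partial x^b}\, u^a u^b,\qquad \dfrac{\partial h_{ab}}{\partial x^i}\, u^a u^b.$

Let $A, B, C, D$ be the four coefficients of the linear combination; then:

$f_i = A\, \dfrac{\partial h_{ia}}{\partial x^a} + B\, \dfrac{\partial h_{aa}}{\partial x^i} + C\, \dfrac{\partial h_{ia}}{\partial x^b}\, u^a u^b + D\, \dfrac{\partial h_{ab}}{\partial x^i}\, u^a u^b.$

From the condition $f_i u^i = 0$ one immediately obtains:

$\left( A\, \dfrac{\partial h_{ia}}{\partial x^a} + B\, \dfrac{\partial h_{aa}}{\partial x^i} \right) u^i + (C + D)\, \dfrac{\partial h_{ab}}{\partial x^i}\, u^a u^b u^i = 0.$

From this equation it follows that:

$A = B = C + D = 0.$

The tensor $f_i$ is then defined by:

(9) $f_i = \left( \dfrac{\partial h_{ia}}{\partial x^b} - \dfrac{\partial h_{ab}}{\partial x^i} \right) u^a u^b.$

Transforming the components of $f_i$ from the normal (italic) system back to the original system, we obtain:

(10) $\mathrm{f}_i = \left( \dfrac{\partial \mathrm{h}_{ia}}{\partial \mathrm{x}^b} - \dfrac{\partial \mathrm{h}_{ab}}{\partial \mathrm{x}^i} \right) \mathrm{u}^a \mathrm{u}^b.$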



Let us now consider the case of a sphere of perfect fluid in slow motion, so that its velocity is negligible in comparison with the velocity of light. To a first approximation we then have:

$u^1 = 1;\quad u^2 = u^3 = u^4 = 0;$

$T^{11} = T^{22} = T^{33} = T^{44} = \tfrac{1}{2}\rho;\quad T^{ij} = 0 \ (i \neq j).$

With the aid of equation (7), we obtain:

$\Box\, h^{ij} = 4\pi\rho \ (i = j);\quad \Box\, h^{ij} = 0 \ (i \neq j).$

For the static case under consideration, these equations reduce to:

$\nabla^2 h^{ij} = -4\pi\rho \ (i = j);\quad \nabla^2 h^{ij} = 0 \ (i \neq j).$

The solutions of these equations are:

$h^{ij} = h \ (i = j);\quad h^{ij} = 0 \ (i \neq j),$

where $h$ is the Newtonian potential corresponding to the given mass, calculated in absolute units.

The forces corresponding to the tensor $h^{ij}$ are:

$f^i = -f_i = \dfrac{\partial h}{\partial x^i} \quad (i = 2, 3, 4).$

Consequently, a system of particles moving in the gravitational field of a slowly moving mass will obey Newton's laws to a first approximation.
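The Newtonian limit can be checked numerically. The sketch below assumes that the force law (10) has the form $f_i = (\partial h_{ia}/\partial x^b - \partial h_{ab}/\partial x^i)\, u^a u^b$ and that the potentials of a sphere at rest are $h_{11} = h_{22} = h_{33} = h_{44} = m/r$; list indices 0 to 3 stand for $x^1$ to $x^4$, and all function names are illustrative.

```python
import math

def static_sphere_potentials(m):
    """Return h(x) giving h_ij at the point x: m/r on the diagonal, 0 elsewhere."""
    def h(x):
        r = math.sqrt(x[1] ** 2 + x[2] ** 2 + x[3] ** 2)
        return [[m / r if i == j else 0.0 for j in range(4)] for i in range(4)]
    return h

def covariant_force(h, x, u, eps=1e-6):
    """f_i = (dh_ia/dx^b - dh_ab/dx^i) u^a u^b, using central differences."""
    dh = []  # dh[k][i][j] approximates the derivative of h_ij along x^k
    for k in range(4):
        xp, xm = list(x), list(x)
        xp[k] += eps
        xm[k] -= eps
        hp, hm = h(xp), h(xm)
        dh.append([[(hp[i][j] - hm[i][j]) / (2 * eps) for j in range(4)]
                   for i in range(4)])
    return [sum((dh[b][i][a] - dh[i][a][b]) * u[a] * u[b]
                for a in range(4) for b in range(4))
            for i in range(4)]

h = static_sphere_potentials(1.0)
f = covariant_force(h, [0.0, 2.0, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0])
print(-f[1])  # contravariant x-component f^2 = -f_2, directed toward the sphere
```

For a test particle at rest the contravariant spatial force equals the gradient of the Newtonian potential $m/r$, and for any velocity the contraction $f_a u^a$ vanishes, in agreement with condition (8).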

3. The exact potentials of a sphere at rest

In the case of complete spherical symmetry and of a state of rest,

$u^1 = 1,\quad u^2 = u^3 = u^4 = 0,$

and in empty space the following equations hold:

$\Box\, h^{ij} = 0.$

Since we are dealing with static potentials, these last equations are equivalent to:

$\nabla^2 h^{ij} = 0.$

The most general solution of these equations that has spherical symmetry, and not merely central symmetry, and that vanishes at infinity, is:

$h^{ij} = \dfrac{m^{ij}}{r},$

where $r$ is the distance from the center.

To determine the $m^{ij}$ it suffices to consider the $h^{ij}$ in the interior of the sphere, where:

$T^{11} = T^{22} = T^{33} = T^{44} = \tfrac{1}{2}\rho;$

$T^{ij} = 0 \ (i \neq j),$

whence $h^{ii} = \dfrac{m}{r}$.

The doubly covariant components of the gravitational tensor are equal to its doubly contravariant components:

$h_{ij} = h^{ij}.$

Then:

(11) $h_{11} = h_{22} = h_{33} = h_{44} = \dfrac{m}{r};\quad h_{ij} = 0 \ (i \neq j);$

here $m$ is the mass of the sphere.

4. The gravitational field of force of a sphere at rest

For $i = 2$, and using (11), equation (10) reduces to:

$f_2 = \dfrac{\partial h}{\partial x^b}\, u^2 u^b - \dfrac{\partial h}{\partial x^2} \left[ (u^1)^2 + (u^2)^2 + (u^3)^2 + (u^4)^2 \right].$

Since $h = \dfrac{m}{r}$,

$f_2 = -\dfrac{m u^2}{r^3} \left( x^2 u^2 + x^3 u^3 + x^4 u^4 \right) + \dfrac{m x^2}{r^3} \left[ (u^1)^2 + (u^2)^2 + (u^3)^2 + (u^4)^2 \right].$

We replace $x^2, x^3, x^4$ by $x, y, z$, respectively, and $u^2, u^3, u^4$ by

$\dfrac{dx}{ds},\quad \dfrac{dy}{ds},\quad \dfrac{dz}{ds}.$

The force in the direction of the $x$-axis is then:

(12) $F^x = f^2 = -f_2 = -\dfrac{mx}{r^3} - \dfrac{2mx}{r^3} \left( x'^2 + y'^2 + z'^2 \right) + \dfrac{m x' r'}{r^2},$

where a prime designates the derivative with respect to $s$. The formulas for $F^y$ and $F^z$ are obtained in an analogous way.

For a particle (such as a planet in the gravitational field of the sun) that moves in a plane, the equations of motion are:

(13) $x'' = -\dfrac{mx}{r^3} - \dfrac{2mx}{r^3} \left( x'^2 + y'^2 \right) + \dfrac{m x' r'}{r^2},$

$y'' = -\dfrac{my}{r^3} - \dfrac{2my}{r^3} \left( x'^2 + y'^2 \right) + \dfrac{m y' r'}{r^2}.$

It is not necessary to write the analogous equation for $t''$, because it is a consequence of equations (13), which are integrable, as we shall see presently.


4. Integration of the equations of motion

If we multiply the first of equations (13) by $y$ and the second by $x$, and subtract member by member, we obtain a differential equation that can be integrated immediately; the result is the following:

$\log (x y' - y x') = -\dfrac{m}{r} - C + \log h.$

Introducing polar coordinates, we obtain:

(14) $r^2\, \dfrac{d\theta}{ds} = h\, e^{-mu - C}, \qquad u = 1/r.$

Here $C$ is a superfluous constant that we shall define in a moment.

Multiplying the first of equations (13) by $x'$ and the second by $y'$, and adding member by member, we obtain:

$x' x'' + y' y'' = -\dfrac{m r'}{r^2} \left( 1 + x'^2 + y'^2 \right).$

The integral of this differential equation is:

(15) $x'^2 + y'^2 = e^{2(m/r + C)} - 1.$
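The two first integrals can be checked by integrating the equations of motion numerically. The sketch below assumes equations (13) in the form $x'' = -(mx/r^3)\,[1 + 2(x'^2 + y'^2)] + m x' r'/r^2$ (and likewise for $y$), and uses a classical fourth-order Runge-Kutta step, which is a choice made here and not a method from the article. By (14) and (15) the quantities $(x y' - y x')\, e^{m/r}$ and $(1 + x'^2 + y'^2)\, e^{-2m/r}$ should stay constant along an orbit.

```python
import math

def acceleration(x, y, vx, vy, m):
    """Second derivatives of x and y with respect to s, following (13)."""
    r = math.hypot(x, y)
    r_prime = (x * vx + y * vy) / r       # r' = dr/ds
    v2 = vx * vx + vy * vy                # x'^2 + y'^2
    ax = -m * x / r**3 * (1.0 + 2.0 * v2) + m * vx * r_prime / r**2
    ay = -m * y / r**3 * (1.0 + 2.0 * v2) + m * vy * r_prime / r**2
    return ax, ay

def rk4_step(state, ds, m):
    """Advance (x, y, x', y') by one Runge-Kutta step of size ds."""
    def deriv(s):
        x, y, vx, vy = s
        ax, ay = acceleration(x, y, vx, vy, m)
        return (vx, vy, ax, ay)
    def shifted(s, k, factor):
        return tuple(si + factor * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shifted(state, k1, ds / 2))
    k3 = deriv(shifted(state, k2, ds / 2))
    k4 = deriv(shifted(state, k3, ds))
    return tuple(s + ds / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def first_integrals(state, m):
    """Quantities that (14) and (15) predict to be constant along the motion."""
    x, y, vx, vy = state
    r = math.hypot(x, y)
    areal = (x * vy - y * vx) * math.exp(m / r)
    energy = (1.0 + vx * vx + vy * vy) * math.exp(-2.0 * m / r)
    return areal, energy

def integrate(state, ds, steps, m):
    for _ in range(steps):
        state = rk4_step(state, ds, m)
    return state

start = (1.0, 0.0, 0.0, 0.12)
end = integrate(start, 0.01, 5000, 0.01)
print(first_integrals(start, 0.01), first_integrals(end, 0.01))
```

The initial data and the value of $m$ are arbitrary illustrative numbers chosen to give a bound, slowly moving orbit.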

Eliminating $s$ from equations (14) and (15), we obtain the geometric differential equation of the trajectory of a particle:

(16) $h^2 \left[ \left( \dfrac{du}{d\theta} \right)^2 + u^2 \right] = e^{2(mu + C)} \left[ e^{2(mu + C)} - 1 \right],$

where $u = 1/r$.

Integrating the differential equation (16) we obtain:

(17.1) $\theta = \displaystyle\int \dfrac{h\, du}{\sqrt{e^{2(mu + C)} \left[ e^{2(mu + C)} - 1 \right] - h^2 u^2}}.$

Likewise $s$ and $t$ can be expressed in terms of $u$ in the following form:

(17.2) $s = \displaystyle\int \dfrac{e^{mu + C}}{h u^2}\, d\theta,$

(17.3) $t = \displaystyle\int \dfrac{e^{2(mu + C)}}{h u^2}\, d\theta.$

5. Advance of the perihelion of a planet

The velocities of the planets are very small compared with the velocity of light, and along their world lines $t$ and $s$ differ very little from each other. From equation (15) it is seen immediately that $mu + C$ is also small. It is therefore justified to neglect the higher powers of $mu + C$ in equation (16), and we can write:

(16') $h^2 \left[ \left( \dfrac{du}{d\theta} \right)^2 + u^2 \right] = 2 \left[ mu + C + 3 (mu + C)^2 \right].$


In (16') the neglected terms are of sixth order in the velocities. Integrating (16') we obtain:

(18) $2\pi + \varepsilon = 2 \displaystyle\int_{u_1}^{u_2} \dfrac{h\, du}{\sqrt{2 \left[ (mu + C) + 3 (mu + C)^2 \right] - h^2 u^2}}.$

Here $\varepsilon$ designates the advance of the perihelion in each planetary revolution, and $u_1$ and $u_2$ are the minimum and maximum values of $u$, corresponding to the maximum and minimum values of $r$, respectively. The exact value of the integral is:

$\dfrac{2\pi h}{\sqrt{h^2 - 6m^2}} = 2\pi \left( 1 + \dfrac{3m^2}{h^2} + \ldots \right).$

The trajectory is a slowly rotating ellipse; therefore:

$h = \sqrt{m a (1 - e^2)},$

where $2a$ is the major axis of the ellipse and $e$ its eccentricity. For $\varepsilon$ we then obtain:

(18') $\varepsilon = \dfrac{6\pi m}{a (1 - e^2)}.$

This is precisely the approximate formula of Einstein's General Theory of Relativity for the shift of the perihelion of a planet.
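As a worked illustration of formula (18'), $\varepsilon = 6\pi m / (a(1 - e^2))$, the short Python sketch below evaluates the advance for Mercury. The numerical values (solar mass in geometric units $GM/c^2$, semi-major axis, eccentricity, orbital period) are standard approximate figures assumed here for illustration; they are not taken from the article.

```python
import math

def perihelion_advance(m, a, e):
    """Advance per revolution in radians: 6*pi*m / (a * (1 - e^2))."""
    return 6.0 * math.pi * m / (a * (1.0 - e * e))

# Assumed illustrative values, with G = c = 1 so that mass is a length in metres.
M_SUN = 1476.6           # G*M/c^2 for the sun, metres
A_MERCURY = 5.791e10     # semi-major axis of Mercury, metres
E_MERCURY = 0.2056       # eccentricity of Mercury's orbit
PERIOD_DAYS = 87.969     # orbital period of Mercury, days

per_orbit = perihelion_advance(M_SUN, A_MERCURY, E_MERCURY)
orbits_per_century = 36525.0 / PERIOD_DAYS
arcsec_per_century = math.degrees(per_orbit * orbits_per_century) * 3600.0
print(round(arcsec_per_century, 1))
```

With these inputs the result is close to 43 seconds of arc per century, the familiar figure for Mercury.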


6. The bending of light

If light is regarded as an electromagnetic wave, the light rays are always straight lines and the velocity of propagation is equal to one in all cases. But there exists an enormous number of luminous phenomena that can only be explained with the aid of the photon. Let us suppose that the photon is a particle of very small mass whose trajectory is the real ray, as in Newton's corpuscular theory. Let us establish the equations of that ray.

Consider equation (16); since the particle moves with a velocity almost equal to one, the constant $C$ is very large. Introducing $h^*$, defined by

$h = h^* e^{2C},$

we obtain:

(16'') $h^{*2} \left[ \left( \dfrac{du}{d\theta} \right)^2 + u^2 \right] = e^{2mu} \left[ e^{2mu} - e^{-2C} \right].$

When $C$ tends to infinity the limiting form of (16'') is:

(19) $h^{*2} \left[ \left( \dfrac{du}{d\theta} \right)^2 + u^2 \right] = e^{4mu}.$


This equation defines the trajectory of the photon. From it one immediately obtains:

(20) $\pi + \varepsilon = 2 \displaystyle\int_0^{1/p} \dfrac{h^*\, du}{\sqrt{e^{4mu} - h^{*2} u^2}}.$

In this equation, $p$ designates the perihelion distance and $\varepsilon$ the angular deviation of the ray. At the perihelion:

$\dfrac{du}{d\theta} = 0,\quad \text{and} \quad h^{*2} = p^2 e^{4m/p}.$

Substituting this value for $h^{*2}$ and writing $u = v/p$, the equation for $\varepsilon$ becomes:

(21) $\varepsilon = 2 \displaystyle\int_0^1 \dfrac{dv}{\sqrt{e^{-4m(1 - v)/p} - v^2}} - \pi,$

which is the most convenient formula for the exact deviation.

In the case of the Sun, $m$ and $m/p$ are very small, so that one may be content with the first terms of the integration by series:

$\varepsilon \approx 2 \displaystyle\int_0^1 \dfrac{dv}{\sqrt{1 - v^2 - 4m(1 - v)/p}} - \pi \approx 4 \sin^{-1} \sqrt{\tfrac{1}{2} (1 + 2m/p)} - \pi \approx 4m/p.$

The result is, to a first approximation:

(22) $\varepsilon = \dfrac{4m}{p},$

which is identical with Einstein's.
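The first-order result can be compared with a direct numerical evaluation of the exact deviation. The sketch below assumes that the exact formula (21) reads $\varepsilon = 2\int_0^1 dv / \sqrt{e^{-4m(1-v)/p} - v^2} - \pi$, which is what substituting $h^{*2} = p^2 e^{4m/p}$ and $u = v/p$ into (20) gives. The substitution $v = 1 - w^2$ and the midpoint rule are choices made here to remove the endpoint singularity; they are not part of the article.

```python
import math

def exact_deflection(m_over_p, n=20000):
    """Deflection angle from the assumed form of (21), by the midpoint rule.

    With v = 1 - w^2 the integrand becomes 2w / sqrt(g(w)), where
    g(w) = exp(-k w^2) - (1 - w^2)^2 and k = 4 m / p; it is finite at w = 0.
    """
    k = 4.0 * m_over_p
    step = 1.0 / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * step
        # expm1 avoids cancellation when k * w^2 is tiny
        g = math.expm1(-k * w * w) + w * w * (2.0 - w * w)
        total += 2.0 * w / math.sqrt(g)
    return 2.0 * total * step - math.pi

ratio = 1.0e-3  # illustrative value of m/p, much larger than the solar value
print(exact_deflection(ratio), 4.0 * ratio)
```

For small $m/p$ the two printed numbers agree to leading order, and the deflection vanishes when $m = 0$.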


7. Red shift of the light of the sun

The question now arises: what result will the new theory predict for light emitted from the sun and observed on the earth?

To solve this problem it is necessary to formulate the concept of the photon more clearly. We think of the photon as a particle whose velocity is slightly less than one, it not being necessary to specify how much it differs from this limiting velocity. At the surface of the sun the photon traverses a distance $dr$ in an interval of local time:

$ds_S = \dfrac{dr}{\sqrt{e^{2(m/r + C)} - 1}},$

where $r$ is the radius of the sun and $m$ is its mass. At the surface of the earth, the analogous formula is:

$ds_T = \dfrac{dr}{\sqrt{e^{2(M/R + C)} - 1}},$

in which $M$ is the mass of the earth and $R$ its radius.

The ratio of the local times is:

(23) $\dfrac{ds_S}{ds_T} = \sqrt{\dfrac{e^{2(M/R + C)} - 1}{e^{2(m/r + C)} - 1}} \approx 1 - \left( \dfrac{m}{r} - \dfrac{M}{R} \right).$

The meaning of this formula is that the photon traverses the same distance $dr$ in a smaller interval of local time on the sun than on the earth. If we regard the shift of the spectral lines as due to the ratio of the local times $ds_S$ and $ds_T$, then our theory predicts a shift toward the red, defined by (23). The corresponding formula in Einstein's General Theory of Relativity is:

(24) $\dfrac{ds_S}{ds_T} = \sqrt{\dfrac{1 - 2m/r}{1 - 2M/R}}.$
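The two ratios can be compared numerically. The sketch below assumes that (23) has the form $\sqrt{(e^{2(M/R + C)} - 1)/(e^{2(m/r + C)} - 1)}$ and that (24) is $\sqrt{(1 - 2m/r)/(1 - 2M/R)}$; the values of $m/r$ for the sun and $M/R$ for the earth are approximate figures in geometric units assumed here, and $C = 20$ is an arbitrary large constant standing in for a photon moving at nearly unit velocity.

```python
import math

def birkhoff_ratio(m_over_r, M_over_R, C):
    """Ratio of local times ds_S / ds_T from (23) for a given constant C."""
    numerator = math.expm1(2.0 * (M_over_R + C))
    denominator = math.expm1(2.0 * (m_over_r + C))
    return math.sqrt(numerator / denominator)

def einstein_ratio(m_over_r, M_over_R):
    """Ratio of local times ds_S / ds_T from (24)."""
    return math.sqrt((1.0 - 2.0 * m_over_r) / (1.0 - 2.0 * M_over_R))

SUN = 2.12e-6     # assumed G*m/(r*c^2) at the solar surface
EARTH = 6.96e-10  # assumed G*M/(R*c^2) at the earth's surface

print(1.0 - birkhoff_ratio(SUN, EARTH, 20.0))
print(1.0 - einstein_ratio(SUN, EARTH))
```

Both expressions give a fractional shift of about two parts in a million; they differ only at second order in $m/r$.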



The Theory of Gravitation set forth in this article satisfactorily explains the three crucial effects of the General Theory of Relativity, using a simpler mathematical apparatus. This advantage gives reason to hope that problems can be solved in this Theory which it has not been possible to attack within the one proposed by Einstein.





Reprinted from The Physical Review, Vol. 66, Nos. 5 and 6, 138-143, September 1 and 15, 1944

Printed in U. S. A.


On Birkhoff’s New Theory of Gravitation

A. Barajas, G. D. Birkhoff, C. Graef, and M. Sandoval Vallarta
National University of Mexico, Mexico D. F., Mexico, and Harvard University, Cambridge, Massachusetts

(Received May 4, 1944)


It is pointed out in the first place: (1) in Birkhoff's gravitational theory based on "flat" space-time, the "red shift" is accounted for by the energy change of the photon as it travels from the emitting body, whereas the photon plays no especial role in the Einstein theory; (2) the solution of the problem of two or more bodies is feasible in the new theory because of its simpler character. Four comments of H. Weyl concerning the Birkhoff theory are discussed, and it is concluded that these are to be taken with much reserve. In regard to the third of these comments it is pointed out that the "perfect fluid" used by Birkhoff as the ultimate carrier of mass and electric charge is to be characterized as the simplest fluid with disturbance velocity that of light (c). It is affirmed to be a glaring defect of earlier relativistic theories that the disturbance velocity in matter has been taken as arbitrary, although that of gravitation and of the electromagnetic field have been equal to c. The differential equations of the theory are then set up. An additional cosmological term in the gravitational potentials $h_{ij}$ is suggested, namely,

where x, y, z, t are Lorentz coordinates and K is the (small) cosmological constant. The explicit formula for the rate of advance of periastron P of two bodies (mass points) of masses $m$ and $m_1$ is given, as obtained from the solution of the two-body problem in the theory, and its possible application to double stars is referred to. The authors propose to give a detailed development of the theory and its applications in papers to be published shortly elsewhere.


I. PRELIMINARY OBSERVATIONS

In a recent number of a journal of wide circulation,¹ Hermann Weyl has given expression to several critical remarks on G. D. Birkhoff’s theory of gravitation of 1942.² In this note we intend first to analyze briefly the substance of Weyl's comments, secondly to consider the structure of the new theory from the physical point of view, and lastly to refer briefly to its physical applications.

¹ H. Weyl, Math. Rev. 4, 285 (1943).

² G. D. Birkhoff, Proc. Nat. Acad. Sci. 29, 231 (1943).

Before doing this, however, we would like to 
make certain general observations. 

The explanation of the "red shift" is fundamentally different in the gravitational theories of Einstein and Birkhoff. In the new theory the red shift has turned out to be accounted for by the energy change of the photon as it travels from the emitting body to the Earth; this explanation fully takes account of the role of the light signal (photon). In Einstein's interpretation, on the other hand, the frequency of the light wave emitted by an atom on a distant body is compared with the frequency of the light wave emitted by the same atom on the Earth, and any phenomena taking place while the light signal (photon) travels from the emitting body to the Earth play no part whatsoever.

Again, the difficulties accompanying the two-body problem and other questions in the general theory of relativity are too well known to require mention, while in Birkhoff's theory the solution of the problem of two or more bodies is quite within reach and will soon be available. This is due to the essentially simpler character of the latter theory.

An objection which may be made to Birkhoff’s theory is the introduction of an absolute reference system, which runs counter to Einstein's general relativity principle that matter determines space, and analogous philosophical ideas previously developed by Ernst Mach. But, in a sense, this objection to Birkhoff might also be urged against Einstein because in the latter's theory a single rotating body can still be supposed alone in the universe, which is absurd from Mach's point of view. It would be difficult indeed to set up any physical theory to which objections of this general nature could not be raised.

II. ANALYSIS OF WEYL’S COMMENTS

To begin with, Weyl contends that Birkhoff’s theory is much the same as Einstein’s theory of 1916, for the case of weak gravitational fields. In a note appearing elsewhere, Barajas³ has shown that the factual consequences of Einstein’s theory in this case differ from Birkhoff's; further, that the choice of the gravitational potentials suggested by Weyl is unsatisfactory; and, lastly, that the trajectories of a test particle in Birkhoff’s theory are not geodesics in any four-dimensional space-time.

Weyl further raises four fundamental objections to Birkhoff’s theory: (1) "The connection between metric and gravitation is dissolved." (2) The proportionality between inertial and gravitational mass "has again become as mysterious as it was before Einstein." (3) Birkhoff’s perfect fluid "appears as a primitive irreducible physical entity." (4) "There seems to be no indication that the mechanical equations spring from a universal law of conservation of energy and momentum." We now propose to take up these points in detail.

³ A. Barajas, Proc. Nat. Acad. Sci. 30, 54 (1944).

(1) With regard to the first, it should be recalled that the connection between matter and geometry, as developed by Einstein, is purchased at the expense of giving up a fundamental reference system. This implies abandoning the description of nature in terms of four fundamental independent variables, essentially unique except for the arbitrariness involved in position and velocity of a single point (Lorentz group).

In the abandonment of such coordinates may be discerned a sufficient reason for the early exhaustion of all observational tests of the general theory of relativity, and also for the fundamental difficulty of assigning actual physical meaning to the coordinates introduced in problems of this theory. Indeed, almost thirty years of intensive research have failed to provide another test besides the three classical ones, or to apply the theory to other fields of physics. In spite of a great deal of work, there does not exist any satisfactory solution of the two-body problem, and the n-body problem has not even been formulated. Even in the simple case of Schwarzschild's solution of the one-body problem, no clear-cut physical interpretation of the Schwarzschild coordinates seems to be available. This last difficulty is so serious that it led Milne to give up the use of general coordinates in relativity.

(2) Weyl's second objection seems to be based largely on a misinterpretation. In Birkhoff's theory, just as well as in Einstein's, the equality between inertial and gravitational mass comes out of the fact that in the gravitational equations the energy tensor is linear in the mass density. While nothing similar to Einstein’s "equivalence principle" has been explicitly formulated in Birkhoff’s theory, it must be stated that the exact significance of the principle has yet to be found. As far as can be seen at the present time it merely asserts that bodies moving freely in empty space near attracting matter behave, relatively to a hypothetical attached reference system, in the same way under all circumstances, insofar as they so behave! From the point of view of differential geometry, it amounts to saying that in the infinitesimal neighborhood of any point in a Riemannian curved space there exists always a tangent flat space, and this is just as true in Einstein’s theory as in any other.

According to Einstein, the proper language for the description of nature is that of tensors, together with the underlying group of general transformations; and the equivalence principle expresses the theorem just stated. According to Birkhoff, the proper language for this description is that of four vectors, and the underlying space-time is everywhere flat. In the latter case, the basic group involves only ten arbitrary constants, while in the former it involves four completely arbitrary functions of four completely arbitrary variables. Whichever choice is made, the equality between gravitational and inertial mass follows as a consequence both of Birkhoff's and Einstein’s theories.

(3) Birkhoff has chosen the name "perfect fluid" for perhaps the simplest substance in which all disturbances are propagated with the velocity of light, on account of the similarity between the equations governing such a substance and those of a perfect fluid. The particular type of perfect fluid considered by Birkhoff is characterized by only one scalar (the mass density) and vector (the velocity). Fundamental difficulties arise if the disturbance velocity is different from that of light. From our point of view, the fact that in Einstein's theory no attention has been paid to this requirement constitutes a glaring defect, which by itself explains the possibility of Birkhoff's theory. Indeed, the postulation of a primordial substance in which all disturbances are propagated with the velocity of light is fundamentally a consequence of the assumption that the basic space-time is everywhere that of Minkowski. This is actually the only physical assumption made in Birkhoff’s theory.

(4) Weyl’s fourth objection also seems to spring from a misunderstanding. In Birkhoff's theory the law of conservation of energy and momentum at material velocities small compared with that of light is as much a consequence of the mechanical equations of motion (and conversely) as in Einstein’s. It is true that in Birkhoff's case the conservation of energy and momentum is not connected with a fundamental geometrical theorem (Ricci’s theorem), as in Einstein's theory. But, although Einstein obtains the formal result that the divergence of the energy-momentum tensor must vanish, this does not imply the conservation of energy and momentum in an exact sense because the four-dimensional integrals of a covariant partial derivative in curved space-time cannot be transformed into three-dimensional integrals. Consequently the conservation theorems of Birkhoff's theory are at least much more precise.

These remarks may serve to point out that 
Weyl’s assertions are to be taken with much 
reserve and that additional research is required 
before the usefulness of Birkhoff’s theory for 
physics can be adequately assessed. 

III. STRUCTURE OF THE BIRKHOFF THEORY

In the light of the preceding remarks, it is possible to recapitulate as follows the considerations which lead to Birkhoff's theory. The fundamental concept of electromagnetic space-time, associated with the names of Fitzgerald, Larmor, Lorentz, Einstein, and Minkowski, has been of the first importance for physics and has played an ever increasing role in the developments of the last fifty years. The revolutionary change which this concept has brought about in physical theory is reflected in the mathematical apparatus employed. Four homogeneous variables of space and time replace the three homogeneous variables of space and the single disparate variable of time; the underlying Lorentz group replaces the Galilean group, and the language of 4 vectors replaces the language of 3 vectors characteristic of classical physics.

The early attempt of Nordström (1912) and others to incorporate gravitation in this framework failed to explain certain delicate gravitational phenomena which alone provide a crucial test. Thus, without an intensive study, electromagnetic space-time was abandoned for a curved or Riemannian space-time, latent in the ideas of Minkowski and realized in Einstein's brilliant gravitational theory of 1916.

It is exceedingly easy to exaggerate the significance of the three so-called critical confirmations of this theory, of which by far the most certain is afforded by the excessive perihelial advance of the planet Mercury. In fact, the building of theories from the aesthetical-mathematical point of view has shown that dimensional considerations enter which lead always to the same result, aside from simple numerical factors. More definitely, the three formulas for perihelial advance, bending of light, and red shift take always the respective forms

$\dfrac{q\,m}{a(1 - e^2)},\qquad \dfrac{r\,m}{p},\qquad s\,m \left( \dfrac{1}{r_1} - \dfrac{1}{r_2} \right),$

where q, r, s are simple numerical constants.

Consequently, the basic test of any such theory really reduces to the single requirement that the first constant q has a value not very different from Einstein’s value 6π! Thus the theory of Einstein, stripped of all mystical trappings, is seen in its proper perspective. It becomes obvious that the question "What is the simplest theory of gravitation and other physical phenomena, based on electromagnetic space-time, which explains the known facts?" deserves the most careful consideration. This question becomes all the more urgent since the Einstein theory, with its enormous mathematical complication and its lack of proper independent variables, seems to be essentially unworkable.

Table I. Table of disturbance velocities.

                        Newton-Maxwell   Einstein    Birkhoff
Matter                  Arbitrary        Arbitrary   c
Gravitation             ∞                c           c
Electromagnetic field   c                c           c

Birkhoff's theory of 1942 provided perhaps the first thoroughgoing attempt to answer this question. Its point of departure arises from the valid criticism of earlier theories that the forms of matter therein employed are inconsistent with the actual requirement that the disturbance velocity be that of light. This situation may be roughly set forth in the comparative Table I, in which the entries indicate disturbance velocity. The physical necessity for a disturbance velocity c of matter appears from the fact that, with any other velocity, the equations of motion break down at the collision of two atomic constituents. Here the matter referred to is not gross matter, but the ultimate carrier of mass and electric charge.

If this requirement is admitted, it appears that 
the first condition of any new theory based on 
electromagnetic space-time should be the con¬ 
dition that matter has a disturbance velocity 
equal to that of light under all circumstances. 
The theory of Birkhoff is based on one such type,
the "perfect fluid," which seems to be the simplest
conceivable from a conceptual point of view.
But it appears almost certain that any other
type of medium obeying this same fundamental
requirement will lead to essentially the same
gravitational theory. The "perfect fluid" may be
approached in the following manner: The state
of matter is regarded as characterized by a single
scalar density $\rho$ and vector velocity $u^i = dx^i/d\tau$
in the sense of local causation; the equations of
motion are to be linear in the rates of change of
these variables; free equilibrium is possible at a
certain density $\rho_0$; the disturbance velocity is to
be that of light c, which becomes 1 if the
light-second is the unit of distance.

We have established the mathematical theorem
that the equations of motion of the perfect fluid
may be written in the 4-vector form

$$\operatorname{div} T \equiv \partial T^{i\alpha}/\partial x^{\alpha} = 0,$$

where $g_{ii} = 1, -1, -1, -1$ for $i = 1, 2, 3, 4$,
respectively, and $g_{ij} = 0$ for $i \neq j$, with appropriate
choice of the scalar density $\rho$, unique up to a
choice of a unit. Furthermore, the perfect fluid
so obtained satisfies the conservation principle
that the integral $\iiint \sqrt{\rho}\, dv$ is conserved ($dv$,
element of volume referred to a rest system).

The equation of motion under arbitrary force
densities $f^i$ may be written correctly

$$\partial T^{i\alpha}/\partial x^{\alpha} = f^i.$$

There is then the essential further requirement
to be imposed on the forces $f^i$ that if the particles
of the fluid return to an initial state of position
and velocities, the density $\rho$ must also return.
For this, it seems necessary and is certainly
sufficient that the condition of orthogonality

$$f_\alpha u^\alpha = 0$$

be identically satisfied. Thus the force vectors
are always required to be identically orthogonal
to the velocity vectors. A primary force vector of




BARAJAS. BIRKHOFF. GRAEF. AND VALLARTA 




this type is evidently the acceleration vector
itself.

Now all force vectors which have been used in
previous relativistic theories are rational and
integral in the components of the velocity
vector $u^i$. It is therefore natural to set:

$$f_i = A_i + B_{i\alpha} u^\alpha + C_{i\alpha\beta} u^\alpha u^\beta + \cdots,$$

where the coefficients are functions of position
alone. But the electromagnetic force is known to
be linear and homogeneous in the velocities and
also in the first partial derivatives of the
electromagnetic potential $\varphi_i$, which itself is of dimension
0 in the unit of length and time. Similarly, in the
theory of Einstein, the gravitational forces are
homogeneous and quadratic in the velocities, and
linear and homogeneous in the first partial
derivatives of the gravitational potentials $g_{ij}$, which
again are of dimension 0. It is therefore natural
to assume that B and C pertain to electromagnetic
and gravitational forces, respectively.
Birkhoff imposes the analogous requirements
upon the vector A and thus arrives at the first
(covariant) form for the forces*


where $\psi$ is his "atomic potential," constant along
the world-lines of the fluid, $\varphi_i$ is the electromagnetic
potential satisfying the Maxwell-Lorentz
equations ($\sigma$, density of electricity)

$$\frac{\partial}{\partial x_\alpha}\left(\frac{\partial \varphi_i}{\partial x^\alpha} - \frac{\partial \varphi_\alpha}{\partial x^i}\right) = 4\pi\sigma u_i,$$

and where $h_{ij}$ is his gravitational potential
satisfying

$$\frac{\partial^2 h_{ij}}{\partial t^2} - \frac{\partial^2 h_{ij}}{\partial x^2} - \frac{\partial^2 h_{ij}}{\partial y^2} - \frac{\partial^2 h_{ij}}{\partial z^2} = 8\pi T_{ij},$$

which is the generalized form of Poisson's equation
of the theory.

Hence the gravitational theory involved may
be singled out as follows: If the equations of
motion for a small total amount of matter
(absence of gravitational force) are written

$$\partial T^{i\alpha}/\partial x^{\alpha} = f^i;$$

then for the general case, these equations are

$$\partial T^{i\alpha}/\partial x^{\alpha} = f^i + f^{gi},$$

where $f^{gi}$ is the gravitational force and where the
gravitational potentials $h_{ij}$ are defined by

$$\Box h_{ij} = 8\pi T_{ij}.$$

Such is the structure of the Birkhoff theory as
hitherto formulated. However, to account for the
phenomena of nebular recession on this basis, it
is necessary to suppose that at some time in the
past there was a nuclear distribution of matter,
with a wide range of velocities. But from the
physical point of view it appears to be more
natural to suppose an initial nuclear distribution
at low relative velocities. In our recent
investigations, we have been led to the following
extension of the original theory involving the
introduction of a cosmological constant.

From the formal point of view the following
more general type of Poisson equation needs
to be especially examined:

$$\Box h_{ij} = aT_{ij} + b\,g_{ij}.$$

With proper choice of a unit of density, it is then
possible to specialize further the above equation
to the form

$$\Box h_{ij} = 8\pi T_{ij} + K g_{ij},$$

where K is the "cosmological constant," which
has the dimensions of a density and is supposed
to be very small.

Now, with such an extended Poisson equation,
it is no longer possible to demand that the
gravitational potentials $h_{ij}$ approach 0 regularly
at $\infty$, nor even that these functions are linearly
infinite. However, the condition may be imposed
that $h_{ij}$ are regularly infinite to at most the
second order, with a boundary distribution at
infinity which is spherically symmetric in a
spatial sense. Obviously, the most natural
possibility from the electromagnetic point of view
is to take the cosmological term in the
gravitational potentials to be

$$h_{ij} = \frac{K}{8}\,(t^2 - x^2 - y^2 - z^2)\, g_{ij},$$

where x, y, z, t are any Lorentz coordinates and
x = y = z = t = 0 is the origin in space-time.

* Birkhoff has changed the form of $A_i$ from $\partial\psi/\partial x^i$ to
$\rho\,\partial\psi/\partial x^i$ because of the dimensionality requirement.




It is our purpose to publish shortly detailed 
studies of the theory outlined above, its cos¬ 
mological extension, and its applications. 

IV. APPLICATIONS AND TESTS
OF THE THEORY

It has not yet been possible for us to make an 
intensive and thoroughgoing study of the Birk- 
hoff theory in relation to known gravitational 
phenomena; and still less has it been possible for 
us to consider other possible applications. 

However, the basic two-body problem has
already been solved by us to the requisite order
of approximation. Thus, for instance, the formula
for rate of advance of periastron P of two bodies
(mass points) of masses $m_1$ and $m_2$ has been found
to be (in absolute units)

$$P = \frac{3m_1^2 + 7m_1 m_2 + 3m_2^2}{m_1 + m_2}\cdot\frac{2\pi}{a(1-e^2)},$$

which in case of an infinitesimal mass ($m_2 = 0$,
$m_1 = m$) reduces to that made familiar by the
Einstein theory, $6\pi m/a(1-e^2)$. Here, of course,
the application to the rotation of the line of the
apsides for double stars is at once suggested. But
such a comparison with observation must await
the further analysis of the observations themselves,
so that appropriate data may be available
in cases where the rotation of the line of the
apsides due to tidal forces is relatively small.
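
As a rough numerical illustration of the periastron formula just given, the following sketch evaluates it for a test body and converts the result to seconds of arc per century. The function name and all numerical values (the Sun's mass expressed as a length, and Mercury's orbital elements) are assumptions supplied here for illustration and are not taken from the paper; masses are in length units, so that the gravitational constant and the velocity of light are both 1.

```python
import math

def periastron_advance(m1, m2, a, e):
    """Advance of periastron per revolution, in radians.

    m1, m2: masses in length units (G = c = 1).
    a: semi-major axis in the same length unit; e: eccentricity.
    """
    mass_factor = (3 * m1**2 + 7 * m1 * m2 + 3 * m2**2) / (m1 + m2)
    return mass_factor * 2 * math.pi / (a * (1 - e**2))

# Assumed illustrative values (not from the paper):
SUN_MASS_M = 1476.6           # G * M_sun / c^2 in metres
MERCURY_A_M = 5.791e10        # semi-major axis in metres
MERCURY_E = 0.2056            # eccentricity
MERCURY_PERIOD_DAYS = 87.969  # orbital period

# Limiting case of an infinitesimal second mass (m2 = 0, m1 = m).
per_orbit = periastron_advance(SUN_MASS_M, 0.0, MERCURY_A_M, MERCURY_E)
orbits_per_century = 36525.0 / MERCURY_PERIOD_DAYS
arcsec_per_century = math.degrees(per_orbit * orbits_per_century) * 3600.0
print(f"{arcsec_per_century:.1f} seconds of arc per century")
```

With these assumed inputs the limiting case agrees with the one-body expression $6\pi m/a(1-e^2)$, and the same function can be called with two comparable masses for a double star.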

We have naturally hoped also that the new
theory would account for some of the hitherto
unexplained features of lunar motion. However,
this does not seem to be a likely outcome
although it may turn out that the solution of the
three-body problem (Earth, Sun, and Moon)
will yield an explanation of the desired type.
Instead, at the present writing, we are inclined 
to think that these supposed features are 
attributable to the large mass ratio (1/83) of 
Moon to Earth, which makes the convergency 


of the mathematical calculations involved slow 
and uncertain. 

It seems not unlikely that the astronomical 
consequences of this theory for the interpretation 
of the red shift from extragalactic sources will be
of special interest. 

As yet no means appear to be available for a 
decisive experimental test between the gravita¬ 
tional theory of Birkhoff and that of Einstein. 
This theoretical uncertainty is likely to continue 
for some time, especially as it appears to be very 
difficult to carry the Einstein theory to specific 
conclusions other than those found by him at 
the outset. 

There remains for later consideration the study
of atomic phenomena on the basis of the
electromagnetic and atomic potentials. It is to be hoped
that, by proper assignment of the atomic potential
$\psi$ to the constituents of the atom, the
explanation of atomic phenomena may be advanced
and possibly the fundamental Schrödinger wave
equation may be obtained in a conceptual
manner.* Here the proton and electron constituting
the hydrogen atom, for instance, are conceived
of as freely interpenetrable and superimposed
in case of equilibrium and as oscillating
under disturbances.

In conclusion we would like to point out that, 
for the physicist, all mathematics is funda¬ 
mentally a form of abstract model building, of 
more or less general aspects of nature; and that 
no experiments which the physicist may perform 
in his laboratory can advance very far without 
free access to a variety of abstract models which 
are not to be thought of as final. The new theory 
of matter, electricity, and gravitation in flat 
space-time proposed by Birkhoff would seem to
afford a model of unusually fundamental, simple, 
and complete type. 

* See G. D. Birkhoff's two notes in the Proc. Nat. Acad.
Sci. 13, 160, 165 (1927).







Reprinted from the Proceedings of the National Academy of Sciences,
Vol. 30, No. 10, pp. 324-334. October, 1944


FLAT SPACE-TIME AND GRAVITATION

By George D. Birkhoff

Department of Mathematics, Harvard University
Communicated September 7, 1944

If one admits that physical events take place in a 4-dimensional space- 
time continuum (an idea abandoned in current quantum-mechanical the¬ 
ory) there are three interesting possibilities: classical space and time; 
flat or electromagnetic space-time; curved space-time. The appropriate 
corresponding mathematical languages are, respectively, those of 3-vec¬ 
tors, 4-vectors and 4-tensors. 

In a certain sense, the flat space-time, characteristic of the so-called special 
theory of relativity, is just as absolute as classical space and time, since the 
coordinates t, x, y, z require exactly 10 arbitrary constants for their com¬ 
plete specification in both cases. But, in the framework of flat space-time, 
the fundamental electromagnetic equations of Maxwell and Lorentz lose
the artificiality which they possess in classical space and time. 

The initial attempts to incorporate gravitational phenomena in flat 
space-time were not satisfactory. Einstein turned to the curved space- 
time suggested by his principle of equivalence, and so constructed his gen¬ 
eral theory of relativity. The initial predictions, based on this celebrated 
theory of gravitation, were brilliantly confirmed. However, the theory has 




not led to any further applications and, because of its complicated
mathematical character, seems to be essentially unworkable. Thus curved
space-time has come to be regarded by many as an auxiliary construct (Larmor)
rather than as a physical reality.

In my opinion, the failure of the early attempts by Nordström and
others to develop a theory of gravitation in flat space-time is to be at¬ 
tributed to the fact that a fundamental theoretic requirement was over¬ 
looked, namely, that the disturbance velocity in matter must be that of 
light. 

With this requirement in mind, I have recently been led to a very simple 
theory of gravitation in flat space-time, concordant with all known gravi¬ 
tational phenomena and free of arbitrary constants. This theory was pre¬ 
sented first in very brief form in 1942 at Tonanzintla, Mexico, and has been 
developed further in a Note in these Proceedings.¹ Furthermore, attention
is to be directed to a Note by A. Barajas,² called forth by a review of
H. Weyl, and to an article by A. Barajas, C. Graef, M. Sandoval Vallarta
and myself, taking up the new theory from the physical point of view.³ A
very significant application of the theory to the two-body problem by Graef
will be published shortly.⁴

Unfortunately, the foundation of the theory has not so far been ade¬ 
quately presented in its philosophic, postulational and mathematical as¬ 
pects. My colleague, Professor Barajas, and I are planning to publish an 
extensive article remedying this deficiency. 

The aim of the present Note is to present briefly these foundational con¬ 
siderations as I see them. It is especially necessary to do so in order to 
avoid further misunderstanding of the new theory. For example, Weyl
says very recently,⁵ referring to my theory: "Their [i.e., 'the field
equations'] most general static centrally symmetric solution involves 3 arbitrary
constants a, b, l. . . . From the present standpoint this is a serious
disadvantage of B." His assertion is wrong since the general exact solution for the
gravitational potentials $h_{ij}$ is

$$h_{ij} = \frac{m}{r}\,\delta_{ij},$$

where r stands for radial distance, $\delta_{ij}$ is the familiar Kronecker $\delta$,
and the mass m is the single arbitrary constant which enters. This
exact solution plays in my theory a rôle analogous to that of the
Schwarzschild solution in the theory of Einstein. Weyl has overlooked the
salient fact that the central body is composed of the basic "perfect fluid."

The proposed theory of gravitation in flat space-time may be character¬ 
ized as follows in its fundamental features: 

1. In the 4-dimensional framework of flat space-time, matter is regarded
as the occupant of certain tubular regions made up of the worldlines of
identifiable points. Point-particles are abandoned once and for all, except
as a limiting possibility.

Thus there is a duality of matter and space-time in my theory, whereas 
the monistic concept of space-time conditioned by matter prevails in Ein¬ 
stein’s theory. Whichever point of view is destined to figure in ultimate 
physical theory, it seems the part of obvious common sense to explore both
possibilities fully.

The simplest available form of matter in flat space-time is that
characterized by a certain stream 4-vector $v^i$ of space type, i.e., with

$$\rho^2 = (v^1)^2 - (v^2)^2 - (v^3)^2 - (v^4)^2 > 0;$$

here $\rho$ is the scalar length of the 4-vector $v^i$. Or, alternatively, we may
think of matter of this kind as characterized by a density $\rho$ and a velocity
4-vector $u^i$, where $v^i = \rho u^i$ and

$$(u^1)^2 - (u^2)^2 - (u^3)^2 - (u^4)^2 = 1.$$

The principle of local causation is taken to hold for an isolated portion 
of this kind of matter, as embodied in the differential equations: 


d/ 





where $t = x^1$, $x = x^2$, $y = x^3$, $z = x^4$, and where $F^i$ are taken to be rational
and integral in the partial derivatives involved. These four differential 
equations assert that the time rates of change of the components of the 
stream vector are functions of these components and their space rates of 
change, being rational and integral in the latter. 

The requirements of the underlying language of 4-vectors, based on the 
Lorentz group, then indicate as the only available possibility, when referred 
to instantaneous rest coordinates ($v^1 = \rho$, $v^2 = v^3 = v^4 = 0$):



dv* r( xdi ' 1 

aT - G(p) ^' 


(i = 2, 3, 4). 


These equations involve a pair of arbitrary functions $F(\rho)$, $G(\rho)$.

By appropriate normalization of the scalar $\rho$ (that is by multiplicative
modification of the 4-vector $v^i$) the preceding equations may be expressed
in the normal form

$$\frac{\partial T^{i\alpha}}{\partial x^{\alpha}} = 0 \qquad (T^{ij} = v^i v^j - p(\rho)\, g^{ij})$$

where $g^{11} = -g^{22} = -g^{33} = -g^{44} = 1$, $g^{ij} = 0$ for $i \neq j$. Here there
appears only one arbitrary function $p(\rho)$. This normalized density $\rho$ is
determined up to a unit of magnitude, while the scalar $p(\rho)$ is determined
up to an additive constant.

2. It is granted that free equilibrium is possible for a certain equilibrium
density, $\rho_0 > 0$, i.e., that a possible state is $v^i = v_0^i$ provided only that we
have

(vo 1 ) 7 - W) 7 - W) 7 - (vo 4 ) 7 = po*. 

Because of the thoroughgoing analogy of the equations obtained with those
of a homogeneous adiabatic fluid in classical physics, it is natural to assume
that along a free boundary we have always $\rho = \rho_0$, and furthermore that at
collision $\rho$ takes on the same value on both sides of the common boundary
until separation occurs for $\rho = \rho_0$ subsequently.

At this stage the behavior of a collection of freely moving portions of this
"fluid" has been completely specified, whether or not collisions occur. The
equations involved present a more familiar aspect if the velocities $u^i$ are
introduced, with

$$T^{ij} = \rho\, u^i u^j - p(\rho)\, g^{ij}.$$

This type of equations has always been taken to be appropriate for a general
homogeneous adiabatic fluid in flat space-time. The symmetric tensor
$T^{ij}$ has been called the energy tensor; $\rho$, the density; and $p(\rho)$, the
pressure.

3. Such a fluid has the property that a certain divergence vanishes: 



-St*/* 


v a ) 


0 


This ensures that the 3-dimensional integral

$$\int \rho\, e^{-\int dp/\rho}\, dv$$

over the rest-volume is invariable.⁶

Keeping in mind the hydrodynamic analogy, it appears to be absolutely 
essential to suppose that this divergence vanishes under all conditions. 
Otherwise the fluid might undergo a full cyclic return to a set of initial 
velocities without a cyclic return of densities, such as is always observed 
to occur with ordinary matter. This requirement means that we have al¬ 
ways 

v >S? vsa - 

4. There is, however, a fundamental theoretic difficulty in the theoretic
employment of the general adiabatic fluid. In fact if two portions of the
fluid collide with oppositely directed velocities nearly equal to that of
light,⁷ the equations of motion will break down if the disturbance velocity of
the fluid is less than that of light. Since it is physically inadmissible that
this velocity v (relative to rest coordinates) exceed that of light, we are led
to require that this velocity, namely,

$$v = \sqrt{dp/(d\rho - dp)},$$

equals that of light, so that at all densities $dp/d\rho = 1/2$.

Thus we obtain by integration the only physically allowable constitutive
equation $p = \rho/2$. The corresponding special form of the general
adiabatic fluid is called the perfect fluid.
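
The step from the disturbance velocity to the constitutive equation can be checked with a short numerical sketch. It assumes only the expression for the velocity stated above, written in terms of the slope $dp/d\rho$ with the velocity of light equal to 1; the function names are hypothetical and are introduced here for illustration.

```python
import math

def disturbance_velocity(dp_drho):
    """Disturbance velocity sqrt(dp / (d rho - dp)), velocity of light = 1."""
    if not 0.0 <= dp_drho < 1.0:
        raise ValueError("need 0 <= dp/drho < 1 for a real, finite velocity")
    return math.sqrt(dp_drho / (1.0 - dp_drho))

def perfect_fluid_pressure(rho, c=0.0):
    """Constitutive equation p = rho/2 + c; c is the additive constant."""
    return 0.5 * rho + c

# Slope of the perfect-fluid law, taken from two sample densities.
slope = (perfect_fluid_pressure(2.0) - perfect_fluid_pressure(1.0)) / (2.0 - 1.0)
v_perfect = disturbance_velocity(slope)   # slope 1/2 gives the velocity of light
v_slow = disturbance_velocity(0.25)       # a smaller slope gives a slower fluid
print(v_perfect, v_slow)
```

The additive constant c drops out of the slope, which is in keeping with the remark that the pressure is determined only up to an additive constant.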

As far as this determination of $T^{ij}$ is concerned, we might have written
more generally $p = \tfrac12\rho + c$, and so have obtained an additional term of
the form $-c\,g^{ij}$ in $T^{ij}$. This modification would not affect the equations of
motion, however. Reason will be given later on for the special choice made
of c = 0 in fixing completely the energy tensor $T^{ij}$.

For the perfect fluid the invariable integral reduces to the simple form

$$\int \sqrt{\rho}\, dv$$

where dv is the 3-dimensional rest volume.

The perfect fluid may be looked upon as the counterpart in flat space-time
of the homogeneous adiabatic incompressible fluid in classical space
and time, which has infinite disturbance velocity. Physically speaking, the
perfect fluid is very nearly incompressible and thus possesses very nearly
invariable mass $\int \rho\, dv$.

If electricity of density $\sigma$ be attached to the perfect fluid, the ratio
$\sigma/\sqrt{\rho}$, called the substance coefficient, remains forever constant along any
worldline.

In what follows the perfect fluid is regarded as the single primordial form
of matter.

5. Suppose now that the perfect fluid, with energy tensor

$$T^{ij} = v^i v^j - \tfrac12 (v_\alpha v^\alpha)\, g^{ij} \tag{1}$$

is not free but is subject to body forces. Formally, the suggestion is obvious
that we define a force vector $f^i$ by means of

$$\frac{\partial T^{i\alpha}}{\partial x^{\alpha}} = f^i \tag{2}$$

where because of the postulated invariance of the integral

$$\int \sqrt{\rho}\, dv \tag{3}$$

we have

$$f_\alpha v^\alpha = 0. \tag{4}$$

Thus the force vector is required to be identically orthogonal to the
stream (or velocity) vector. It may be recalled that the acceleration vector
$a^i$ along any worldline has the same property.

In view of the identity (4) it is clear that $f^i$ cannot be independent of the
vector $v^i$. For the case of electrically charged matter $f^i$ is known to be
linear homogeneous in the 4-vector $v^i$ and identically orthogonal to it. For
the gravitational forces it is the simplest possible hypothesis to assume that
in a purely gravitational field $f^i$ is homogeneous and quadratic in $v^i$ and
proportional to $\rho$. We write therefore in that case

$$f^i = \varphi^i_{jk}\, v^j v^k \tag{5}$$

where the components of the tensor $\varphi^i_{jk} = \varphi^i_{kj}$ are functions of t, x, y, z
defined throughout space-time.

Without going into any detail it is to be stressed that this is the most
natural assumption which can be made about the manner in which $f^i$
depends on the stream vector $v^i$. This is especially the case in view of the
reversibility of gravitational phenomena in time.
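
The orthogonality requirement on a force that is linear homogeneous in the velocity can be illustrated numerically. The sketch below assumes, as is usual for the electromagnetic case though not spelled out in the text, that such a force has the covariant form $f_i = F_{i\alpha}u^\alpha$ with an antisymmetric tensor $F_{ij}$; the tensor entries, the sample velocity and the function names are hypothetical.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)   # g_11, g_22, g_33, g_44 with x^1 = t

def four_velocity(vx, vy, vz):
    """Unit 4-velocity u^i for ordinary velocity (vx, vy, vz), |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - (vx * vx + vy * vy + vz * vz))
    return (gamma, gamma * vx, gamma * vy, gamma * vz)

def linear_force(field, u):
    """Covariant force f_i = F_{i a} u^a for a covariant tensor F_{ij}."""
    return tuple(sum(field[i][a] * u[a] for a in range(4)) for i in range(4))

def contraction(f_cov, u):
    """Invariant f_a u^a (covariant f contracted with contravariant u)."""
    return sum(f_cov[a] * u[a] for a in range(4))

# Hypothetical antisymmetric field tensor, F_{ij} = -F_{ji}.
F = [[0.0, 0.3, -0.2, 0.5],
     [-0.3, 0.0, 0.7, -0.1],
     [0.2, -0.7, 0.0, 0.4],
     [-0.5, 0.1, -0.4, 0.0]]

u = four_velocity(0.3, -0.4, 0.5)
norm = sum(METRIC[i] * u[i] ** 2 for i in range(4))   # should equal 1
f = linear_force(F, u)
orthogonality = contraction(f, u)                      # should vanish
print(norm, orthogonality)
```

The contraction vanishes for every velocity because a symmetric product $u^i u^\alpha$ is summed against an antisymmetric tensor; a force independent of the velocity could not have this property for all $u^i$.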

6. At this stage of our genetic account of the theory under consideration, 
in full accordance with the tradition of the past, it will be supposed to begin 
with that non-gravitational forces, such as electrical forces, need not be 
considered in gravitational problems. This hypothesis is justified by the 
more complete form of the theory given in the Note already alluded to.¹

For the case of a rest system the x, y, z components of the gravitational
forces take the form

$$f_x = \rho\,\varphi^2_{11}, \qquad f_y = \rho\,\varphi^3_{11}, \qquad f_z = \rho\,\varphi^4_{11} \tag{5'}$$

with $f_t = 0$ of course.

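In the corresponding Newtonian situation the force components are:

$$f_x = \rho\,\frac{\partial g}{\partial x}, \qquad f_y = \rho\,\frac{\partial g}{\partial y}, \qquad f_z = \rho\,\frac{\partial g}{\partial z},$$

where g is the Newtonian potential defined by Poisson's equation

$$\frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2} + \frac{\partial^2 g}{\partial z^2} = -4\pi\rho, \text{ or 0 in empty space,}$$

together with the requirement that g is finite (or vanishes) at infinity.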

By formal analogy one is led to require that $\varphi^i_{jk}$ are linear homogeneous
in the first derivatives of the gravitational potential defined by means of an
appropriate Poisson equation.

A very simple possibility would be to set up a scalar gravitational
potential h defined by

$$\frac{\partial^2 h}{\partial t^2} - \frac{\partial^2 h}{\partial x^2} - \frac{\partial^2 h}{\partial y^2} - \frac{\partial^2 h}{\partial z^2} = 8\pi T, \text{ or 0 in empty space,}$$

where we have written

$$T = g_{\alpha\beta} T^{\alpha\beta} = -\rho$$

for the contracted energy tensor. However, with this form of $f^i$ it is not
possible to fulfil the vital condition of orthogonality, embodied in (4).

The alternative, equally simple, hypothesis from the formal point of view
is to take as the analog of the Poisson equation

$$\left[\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}\right] h_{ij} = 8\pi T_{ij}, \text{ or 0 in empty space,} \tag{6}$$

with the additional requirement that $h_{ij}$ are finite (or vanish) at infinity.
The gravitational potential $h_{ij}$ thus introduced is a symmetric tensor of the
second order. No essential limitation is introduced by use of the special
numerical factor $8\pi$ on the right.

It is now readily seen that the same condition of orthogonality (4) leads
to the following uniquely determined form for the gravitational force
vector $f^i$:



The theory has now been completely formulated for the limited form in which
only gravitational forces enter. It may be extended so as to include electrical
and atomic forces in a natural and interesting manner (see reference 1 and
Section 11 below). In the limited but important form now under
consideration, with use of the usual tensorial subscript and superscript notation, the
theory is embodied in the pair of equations



*11 - 

&h ti 

dx-dx* 


iy (Mya _ 

g \Zx* ' 

= 8tcT ij, or 0 in empty space, 


( 8 ) 


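where $T^{ij}$ is given by equation (1).

At this stage it is clear why it was natural to take the undetermined
constant c in $T^{ij}$ to be 0. Otherwise we should have had

$$T^{ij} = v^i v^j - \tfrac12 (v_\alpha v^\alpha)\, g^{ij} - c\,g^{ij} = \rho\,(u^i u^j - \tfrac12 g^{ij}) - c\,g^{ij}$$

and we should obtain as the Poisson equation for the limiting case $\rho = 0$,
inside of the fluid,

$$\left[\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}\right] h_{ij} = -8\pi c\, g_{ij},$$

which seems inappropriate unless c is 0.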




On the basis of the theory thus obtained it is found that isolated bodies 
have a static, centrally symmetric distribution, and that a collection of 
spherical bodies move with very high approximation according to the 
Newtonian law of gravitation, relative to a convenient rest system. 

7. Let us now turn to the case of a static, centrally symmetric
distribution of the perfect fluid. It is not difficult to determine the radial
distribution of the density $\rho$, but for our present purposes it suffices to observe
that we have precisely (r, radial distance),

$$T^{ij} = \tfrac12\,\rho(r)\,\delta_{ij}.$$

Our extended Poisson equation takes the form

$$\frac{\partial^2 h_{ij}}{\partial x^2} + \frac{\partial^2 h_{ij}}{\partial y^2} + \frac{\partial^2 h_{ij}}{\partial z^2} = -4\pi\rho(r)\,\delta_{ij}, \text{ or 0 in empty space.}$$

Here, of course, the boundary of the (spherical) distribution occurs for some
value $r_0$ of r where $\rho(r_0) = \rho_0$. Thus we find as the exact solution for the
centrally symmetric case

$$h_{ij} = \frac{m}{r}\,\delta_{ij}$$

in empty space ($r > r_0$), involving, as the single arbitrary constant, the
mass m of the fluid.
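
That the potential m/r satisfies the empty-space form of the static Poisson equation can be checked by a finite-difference sketch. The step size, the sample point, the value of the mass and the function names below are assumptions made for illustration; the check is approximate and holds only away from r = 0.

```python
import math

def potential(x, y, z, m):
    """Diagonal component m/r of the static potentials outside the body."""
    return m / math.sqrt(x * x + y * y + z * z)

def laplacian(func, point, step=1e-3):
    """Central-difference estimate of d2/dx2 + d2/dy2 + d2/dz2 at point."""
    centre = func(*point)
    total = 0.0
    for axis in range(3):
        forward = list(point)
        backward = list(point)
        forward[axis] += step
        backward[axis] -= step
        total += (func(*forward) - 2.0 * centre + func(*backward)) / step**2
    return total

M = 1.5  # hypothetical mass in units with G = c = 1
residual = laplacian(lambda x, y, z: potential(x, y, z, M), (1.0, 2.0, 2.0))
print(residual)  # close to 0 in empty space, up to discretization error
```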

In this way we obtain the gravitational potentials around a static, 
centrally symmetric body like the Sun. These potentials are not observ¬ 
ably affected by random atomic motion. 

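8. Now consider a comparatively small approximate sphere of the perfect
fluid, forming in the limit a kind of ideal particle of mass 0. First, we
have essentially $\rho = \rho_0$ throughout, so that we may write

$$\frac{\partial T^{i\alpha}}{\partial x^{\alpha}} = \rho_0\,\frac{\partial (u^i u^\alpha)}{\partial x^{\alpha}} = \rho_0\, u^\alpha\,\frac{\partial u^i}{\partial x^{\alpha}}$$

with high approximation, since (4) yields with similar approximation

$$\frac{\partial u^\alpha}{\partial x^{\alpha}} = 0.$$

But we have

$$u^\alpha\,\frac{\partial u^i}{\partial x^{\alpha}} = \frac{du^i}{ds} = \frac{d^2 x^i}{ds^2} \qquad (ds^2 = dt^2 - dx^2 - dy^2 - dz^2)$$

along the worldline of the ideal particle, so that we may write

$$\frac{\partial T^{1\alpha}}{\partial x^{\alpha}} = \rho_0\,\frac{d^2 t}{ds^2}, \quad \frac{\partial T^{2\alpha}}{\partial x^{\alpha}} = \rho_0\,\frac{d^2 x}{ds^2}, \quad \frac{\partial T^{3\alpha}}{\partial x^{\alpha}} = \rho_0\,\frac{d^2 y}{ds^2}, \quad \frac{\partial T^{4\alpha}}{\partial x^{\alpha}} = \rho_0\,\frac{d^2 z}{ds^2}.$$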

Thus there are obtained the differential equations of motion for a com¬ 
paratively small body which moves in the field of a larger central body. 
These lead to essentially the same result for the perihelial advance of a 
planet and for the bending of a light photon in the field of the Sun as does 
the theory of Einstein. 

Furthermore, if we assume that the Planck formula

$$E = h\nu$$

determines the frequency of radiation $\nu$, the result of Einstein for the
"red shift" in light reaching the Earth from the Sun is obtained. Inasmuch
as the precise mechanism of radiation is unknown, it seems more correct
to employ this basic formula, than to give an explanation in which
the light-carrying photon plays no rôle, such as is afforded by the Einstein
theory.

9. A real test of the availability of the new theory in other directions
is afforded by the problem of two or more bodies.

As a first approximation to this problem, it is natural to consider the
limiting case of n "ideal particles" of masses $m_1, m_2, \ldots, m_n$ respectively,
obtained by taking the equilibrium density $\rho_0$ to be very large. It is
clear that in the neighborhood of each particle $P_i$ of mass $m_i$, the
corresponding gravitational potentials should have a principal part



in instantaneous rest coordinates. 

Graef has already shown (see references 3 and 4) that the calculations
involved can be effectively carried out in the case of two bodies of masses
$m_1$ and $m_2$. Presumably his method can be extended to the case of more
than two bodies. Furthermore it should be possible to investigate in a
similar spirit all of the fine-structure corrections to the Newtonian theory
which lie within the limits of observation.

10. In order to generalize the theory so as to admit cosmological terms
one has only to write the Poisson equation in the form

$$\left[\frac{\partial^2}{\partial t^2} - \frac{\partial^2}{\partial x^2} - \frac{\partial^2}{\partial y^2} - \frac{\partial^2}{\partial z^2}\right] h_{ij} = 8\pi T_{ij} + K g_{ij}$$

where K is the small cosmological constant. This really means that we
allow a form of energy tensor $T^{ij}$ containing the term $-c\,g^{ij}$ previously
indicated, with $c = -K/8\pi$ small but not 0.

The conditions at $\infty$ have then clearly to be lightened to the form that
$h_{ij}$ becomes regularly infinite to the second order at infinity, in which case
it is concluded that for a uniquely determined space-time origin we have

$$h_{ij} = h^*_{ij} + \frac{K}{8}\,(t^2 - x^2 - y^2 - z^2)\, g_{ij}$$

where $h^*_{ij}$ satisfy the previously indicated form of the Poisson equation and
boundary conditions. In this case there is obtained an expanding universe,
about the space-time origin t = x = y = z = 0 in the flat space-time
under consideration.
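
The factor 1/8 in the cosmological term can be verified directly: the wave operator applied to $t^2 - x^2 - y^2 - z^2$ gives 8, so the term contributes exactly K times $g_{ij}$ to the extended Poisson equation. The following sketch checks this with central differences; the value of K, the sample point and the function names are hypothetical.

```python
def cosmological_term(t, x, y, z, K):
    """Scalar factor (K/8)(t^2 - x^2 - y^2 - z^2) multiplying g_ij."""
    return K / 8.0 * (t * t - x * x - y * y - z * z)

def dalembertian(func, point, step=1e-2):
    """Central-difference estimate of d2/dt2 - d2/dx2 - d2/dy2 - d2/dz2."""
    signs = (1.0, -1.0, -1.0, -1.0)
    centre = func(*point)
    total = 0.0
    for axis, sign in enumerate(signs):
        forward = list(point)
        backward = list(point)
        forward[axis] += step
        backward[axis] -= step
        second = (func(*forward) - 2.0 * centre + func(*backward)) / step**2
        total += sign * second
    return total

K = 1e-3  # hypothetical small cosmological constant
value = dalembertian(lambda t, x, y, z: cosmological_term(t, x, y, z, K),
                     (2.0, 0.5, -1.0, 0.25))
print(value)  # approximately K
```

Since the term is quadratic in the coordinates, the central differences are exact apart from rounding, and the result does not depend on the point chosen.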

11. In the general case there is an electromagnetic 4-potential $\varphi_i$
satisfying Maxwell's equations, and an atomic potential $\psi$ constant along every
worldline. This leads to the complete theory with (covariant) force
components:



where the terms corresponding to atomic, electric and gravitational forces
are homogeneous of degrees 0, 1 and 2, respectively, in the velocity vector,
and are linear homogeneous in the first partial derivatives of the
corresponding potentials $\psi$, $\varphi_i$ and $h_{ij}$, respectively.

It will require further mathematical investigation in order to determine
the serviceability of this conjectural theory in the domain of atomic
physics. I have previously indicated how an equation much like the Schrödinger
wave equation may be obtained on the basis of the atomic potential $\psi$.⁸
However, my attempt was based on a background of curved space-time,
and I had not then realized that the potentials must all be of zero dimensions,
so that I used $\partial\psi/\partial x^i$ in place of $\rho\,\partial\psi/\partial x^i$.

12. Professors Barajas, Graef, Sandoval Vallarta and I are examining 
further these and other physical problems in the light of the new theory. 
Meanwhile, it seems clear that the theory promises well and deserves 
careful study because of its striking simplicity, completeness and mathe¬ 
matical consistency. Furthermore, as will be seen from what precedes, 
the theory is independent of all ideas of curved space-time and of the 
corresponding Einstein theory. 

No doubt, in view of the substantial successes of the Einstein theory, it 
is worth while to attempt to reflect that theory on to flat space-time, and so 
to obtain a degenerate theory, which in a certain sense is only the shadow 
of a shadow. However, the objections made by Barajas² to the form of
degenerate theory considered by Weyl⁵ seem to be substantially valid.⁹

As far as I can see, the Einstein principle of equivalence (“that inertia 
and gravitation are one,” Weyl, loc. cit.) is at bottom a mathematical 
principle signifying only that certain equations in the Einstein theory are

of matter—a fact just as true of the 




theory of gravitation here proposed as of the general theory of relativity. 
To me there is only the following mathematical fact in the comparison of 
the basic points of view of the Einstein theory and my own: in his theory 
there is no underlying framework of independent variables t, x, y, z valid 
throughout space-time, such as are present in the theory based on flat 
space-time. The real question is whether or not the theory based on such 
special coordinates is simpler and more useful for the description and pre¬ 
diction of the physical facts. This is not a question to be decided by a 
priori considerations. What is required is rather a study of the new theory 
and its physical applications. 


¹ Birkhoff, G. D., "Matter, Electricity and Gravitation in Flat Space-Time," these
Proceedings, 29, 231 (1943). My lecture at Tonanzintla is about to appear under the
title "El Concepto de Tiempo y la Gravitación," Boletín de la Sociedad Matemática
Mexicana, 1, No. 4 (1944).

² Barajas, A., "Birkhoff's Theory of Gravitation and Einstein's for Weak Fields,"
these Proceedings, 30, 54 (1944).

³ Barajas, A., Birkhoff, G. D., Graef, C., and Sandoval Vallarta, M., "On Birkhoff's
New Theory of Gravitation," Physical Review, 66, 138 (1944).

⁴ In vol. 1, No. 5 (1944) of the Boletín de la Sociedad Matemática Mexicana.

⁵ Weyl, H., "Comparison of a Degenerate Form of Einstein's with Birkhoff's Theory of
Gravitation," these Proceedings, 30, 205 (1944).

⁶ See for instance, Birkhoff, G. D., Relativity and Modern Physics, Chaps. VII and XI,
Cambridge, 1923, 1927.

⁷ The velocity of light c is 1 since the lightsecond is regarded as the unit of distance.

⁸ Cf. two notes in these Proceedings, 13, 160, 165 (1927).

⁹ In his Note Barajas is more than fair to the "degenerate theory," which, strictly
speaking, is no more usable than is the early Nordström theory. See M. Wyman, Math.
Rev., 5, 218 (1944). Since all of the relativistic theories of gravitation take the classical
Newtonian theory as prototype, the formal resemblance between them is inevitably
considerable. This fact is stressed, for example, in my article, "Newtonian and Other
Forms of Gravitational Theory," Scientific Monthly, 58, 49 and 136 (1944). It is to be
looked upon as the source of the formal resemblance between Einstein's general theory
of relativity, based on curved space-time, and my own theory, based on flat space-time,
which Weyl refers to.





